Wilson Mar

The sidecar proxy separates cross-cutting operational concerns from business logic, handing off to a control plane to do the rest

Overview

“Service mesh” architecture gives microservices applications a standard way to hand off service-to-service access control, authentication, encrypted communications, monitoring, logging, timeout handling, load balancing, health checks, and other operational cross-cutting concerns to a sidecar proxy within each pod. Each sidecar works with a “control plane” common to all services.

Decentralized microservices apps, Dockerized to run in containers, make all their network communication through their sidecar proxy (like handing a box to UPS to deliver).

BLOG: loadbal-sidecar-700x324

Each sidecar proxy can communicate with backends in different zones (generically named A, B, and C in the diagram). Policies sent to each sidecar can specify the zones and the share of traffic to send to each zone. Each zone can be in a different cloud (thus multi-cloud).
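As a rough illustration of the per-zone traffic policy described above, the sketch below picks a zone for each request in proportion to a configured weight. The weights, function names, and the use of simple weighted random choice are assumptions made for illustration; they are not taken from any particular proxy.

```python
import random
from collections import Counter

# Hypothetical policy: share of traffic per zone (zone names follow the diagram).
ZONE_WEIGHTS = {"A": 70, "B": 20, "C": 10}

def pick_zone(weights, rng=random):
    """Pick a zone at random, in proportion to its configured weight."""
    zones = list(weights)
    return rng.choices(zones, weights=[weights[z] for z in zones], k=1)[0]

def simulate(weights, requests, seed=0):
    """Route a number of requests and count how many land in each zone."""
    rng = random.Random(seed)
    return Counter(pick_zone(weights, rng) for _ in range(requests))
```

Calling simulate(ZONE_WEIGHTS, 10000) should return counts close to the configured 70/20/10 split. Changing the policy is then only a matter of pushing new weights to the sidecar.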

This approach also enables security-related policies to be applied (such as limiting outflows from some countries) and detection of zonal failure and automatic rerouting around traffic anomalies (including DDoS attacks).

Each sidecar proxy and backend service periodically reports its state to the global load balancer (GLB) so it can make decisions that take into account latency, cost, load, current failures, etc. This enables the centralized visualizations engineers use to understand and operate the entire distributed system in context.

This means app developers no longer need to bother coding for a long list of operational cross-cutting concerns:

  • Collection and reporting of telemetry (health checks, logs, metrics, traces)
  • TLS termination (certificate and key handling)
  • Handling of HTTP/2, WebSocket, gRPC, and Redis protocols, as well as raw TCP traffic
  • Rate limiting (DoS mitigation)
  • Timeout and back-off handling when a response is not received
  • Circuit breakers
  • Fault injection (for chaos engineering to improve reliability)
  • Enforcement of policy decisions
  • Load balancing
  • Staged rollouts with percentage-based traffic splits
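To give a feel for what one item in this list involves, here is a minimal sketch of timeout handling with retries and exponential back-off, the kind of logic a sidecar takes off the application's hands. The function names, default values, and the use of TimeoutError as the failure signal are illustrative assumptions.

```python
import time

def backoff_delays(retries, base=0.1, factor=2.0, cap=2.0):
    """Return the wait before each retry: base, base*factor, ... capped at cap."""
    return [min(cap, base * factor ** i) for i in range(retries)]

def call_with_retries(operation, retries=3, sleep=time.sleep, **backoff):
    """Call operation(); on TimeoutError wait and retry, up to `retries` retries."""
    delays = backoff_delays(retries, **backoff)
    for attempt in range(retries + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == retries:
                raise  # retry budget exhausted: surface the failure
            sleep(delays[attempt])
```

The sleep parameter is injectable so the behavior can be exercised without real waiting. Production proxies typically add random jitter to the delays as well, which this sketch leaves out.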

Embedding the above functionality in each app program may provide the best performance and scalability, but requires polyglot coding to implement the library in many languages. It can also be cumbersome to coordinate upgrades of new versions of each library across all services.

Several sidecar programs have been created:

Logically, packets and requests travel through a “Data Plane”.

Control Plane

There is also a “Control Plane” which, rather than exchanging packets/requests, traffics in policies and configuration settings to enable services such as:

  • deploy control (blue/green and/or traffic shifting),
  • authentication and authorization settings,
  • route table specification (e.g., when service A requests /foo what happens), and
  • load balancer settings (e.g., timeouts, retries, circuit breakers, etc.).
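The route table item in the list above ("when service A requests /foo what happens") can be pictured as data that the control plane pushes and the sidecar consults on every request. The sketch below is a simplification under stated assumptions: the route entries, cluster names, and the longest-prefix matching rule are all hypothetical, and real proxies may apply different matching rules.

```python
# Hypothetical route table, as a control plane might push it to a sidecar.
ROUTES = [
    {"prefix": "/foo", "cluster": "foo-v1", "timeout_s": 2.0, "retries": 2},
    {"prefix": "/foo/admin", "cluster": "foo-admin", "timeout_s": 5.0, "retries": 0},
    {"prefix": "/", "cluster": "default-backend", "timeout_s": 1.0, "retries": 1},
]

def resolve(path, routes=ROUTES):
    """Return the route with the longest prefix matching the request path."""
    matches = [r for r in routes if path.startswith(r["prefix"])]
    if not matches:
        return None  # no route configured for this path
    return max(matches, key=lambda r: len(r["prefix"]))
```

Because the table is plain data, the control plane can change timeouts, retries, or the target cluster for /foo without the application being rebuilt or redeployed.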

The control plane aggregates telemetry data for display on dashboards such as the hero image above.

svcmesh-v01-810x576.png

Individual apps interact with a proxy (Kubernetes sidecar) running on each service instance. The sidecars communicate with a Control Tower. This out-of-process architecture puts the hard parts in one place and allows app developers to focus on business logic. A separate library in each language for operational concerns is no longer needed.

The “Control Plane” is a traffic controller that handles tracing, monitoring, logging, alerting, A/B testing, rolling deploys, canary deploys, rate limiting, retry and circuit-breaker activities (including creation of new instances), authentication, and authorization, all based on application-wide policies.

The control plane includes an application programming interface, a command‑line interface, and a graphical user interface for managing the app.

Within a Service Mesh, apps create service instances from service definitions (templates). Thus, the term “service” refers to both the definitions and the instances themselves.

Several products provide a “control plane UI” (web portal/CLI) to set global system configuration settings and policies, as well as:

  • Dynamic service discovery
  • Certificate management (the control plane acts as a Certificate Authority (CA) and generates certificates to allow secure mTLS communication in the data plane)
  • Automatic self-healing and zone failover (to maximize uptime)

Control Plane vendors

Several control plane vendors compete on features, configurability, extensibility, and usability:

  • Istio (whose control plane daemon is istiod), backed by Google, IBM, and Lyft (which contributed its Envoy proxy that works within Kubernetes as a sidecar proxy instance)

  • open-sourced Nelson uses Envoy as its proxy and builds a robust service mesh control plane around the HashiCorp stack (i.e. Nomad, etc.).

  • Kong Mesh, the commercially licensed edition of open-source Kuma (kuma.io), which was donated to the CNCF. It’s built on top of Envoy.

  • SmartStack creates a control plane using HAProxy or NGINX.

  • NGINX proxy

Cloud Foundry Spring Cloud?

Istio

Istio (https://istio.io/) aims to provide a uniform way to secure, connect, and monitor microservices. It provides rich automatic tracing, monitoring, and logging for all services in a “service mesh” (the network of microservices).

https://istio.io/docs/reference/config/

Istio provides APIs that let it integrate into any logging platform, or telemetry or policy system.

Istio makes it easy to create a network of deployed services with load balancing, service-to-service authentication, monitoring, and more.

“Without any changes in service code” applies only if the app has not implemented its own mechanism duplicative of Istio, like retry logic (which can bring a system down without attenuation mechanisms).
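The risk of duplicated retry logic comes down to simple arithmetic: retries at each layer multiply rather than add. The sketch below computes the worst case; the function name and the example retry counts are hypothetical.

```python
from math import prod

def worst_case_attempts(retries_per_layer):
    """Worst-case calls reaching the backend when every layer retries on failure."""
    # Each layer makes 1 initial attempt plus its retries, and every one of
    # those attempts is itself retried by the layer beneath it.
    return prod(1 + r for r in retries_per_layer)
```

For example, if the app retries 3 times and its sidecar also retries 3 times, worst_case_attempts([3, 3]) gives 16 attempts against a backend that is already failing, from a single user request. A third retrying layer raises that to 64.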

gRPC

gRPC is a high-performance, open-source universal RPC framework built on top of HTTP/2 to enable streaming between client and server.

It originated as project “stubby” within Google and is now a F/OSS project with open specs.

https://grpc.io/blog/principles:

  • Clients open one long-lived connection to a gRPC server
  • A new HTTP/2 stream is opened for each RPC call

gRPC avoids mistakes of SOAP WSDL:

  • Protobuf vs. XML

References:

  • https://www.youtube.com/watch?v=RoXT_Rkg8LA by Twilio
  • https://github.com/salesforce/reactive-grpc Lyft Envoy uses gRPC bridge to unlock Python gevent clients.
  • https://www.youtube.com/watch?v=hNFM2pDGwKI Introduction to gRPC: A general RPC framework that puts mobile and HTTP/2 first (M.Atamel, R.Tsang)

Envoy (from Lyft)

Envoy provides robust APIs for dynamically managing its configuration.

Envoy is container-aware of Docker.

Envoy speaks HTTP/2 (H2) on both the downstream and upstream sides, and supports gRPC.

Does shadowing (fork traffic to a test cluster for live perf testing)

Envoy is written in C++11.

VIDEO: Lyft’s Envoy: From Monolith to Service Mesh Feb 14, 2017 by Matt Klein (Lyft) explains, from a developer’s viewpoint, the case for SoA and the issues it raises.

L7 reverse proxy at edge (replacement for NGINX).

Lyft uses LightStep for tracing, WaveFront for stats (via statsd).

NGINX

NGINX proxy

NGINX built the equivalent of Istio Envoy.

https://www.nginx.com/blog/what-is-a-service-mesh/

https://www.nginx.com/blog/introducing-the-nginx-microservices-reference-architecture/

Linkerd

Linkerd (https://linkerd.io) is an incubating project of the Cloud Native Computing Foundation (CNCF), which also hosts graduated projects Kubernetes and Prometheus, plus Helm, OpenTracing, gRPC, etc.

Its data plane proxy is written in the Rust programming language.

Linkerd provides Grafana dashboards and CLI debugging tools for Kubernetes services, with no cluster-wide installation required:

svcmesh-linkerd-dataplane-grafana-1570x462

Its customers include Salesforce, Walmart, PayPal, Expedia, Comcast.

References:

  • https://linkerd.io/2/getting-started/ for installation, etc.

Patterns

Circuit breaker

The circuit breaker pattern isolates unhealthy instances, then gradually brings them back into the healthy instance pool if warranted.
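A minimal sketch of the pattern follows, assuming a simple closed, open, and half-open state machine for a single backend instance. The class name, thresholds, and timings are illustrative assumptions and do not reflect how any particular proxy implements it.

```python
import time

class CircuitBreaker:
    """Minimal closed -> open -> half-open circuit breaker for one backend instance."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures   # consecutive failures before opening
        self.reset_after = reset_after     # seconds to wait before a trial request
        self.clock = clock
        self.failures = 0
        self.opened_at = None              # None means the circuit is closed

    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_after:
            return "half-open"
        return "open"

    def allow_request(self):
        """Closed and half-open circuits let traffic through; open ones do not."""
        return self.state() != "open"

    def record_success(self):
        self.failures = 0
        self.opened_at = None              # instance rejoins the healthy pool

    def record_failure(self):
        self.failures += 1
        if self.state() == "half-open" or self.failures >= self.max_failures:
            self.opened_at = self.clock()  # (re)open and restart the timer
```

While the circuit is open, the instance receives no traffic. After reset_after seconds a trial request is allowed: success returns the instance to the pool, and failure isolates it again for another waiting period.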


Workshop

There is a quite thorough hands-on workshop using GKE (Google Kubernetes Engine).

  • https://github.com/retroryan/istio-workshop is the original, worked on by Ryan and others. It contains the exercises.

  • https://github.com/srinandan/istio-workshop

In this workshop, you’ll learn how to install and configure Istio, an open source framework for connecting, securing, and managing microservices, on Google Kubernetes Engine, Google’s hosted Kubernetes product. You will also deploy an Istio-enabled multi-service application.


Install prereqs and Istio

  1. See my tutorial to Install Git and other utilities
  2. See my tutorial to Install Kubernetes (minikube)
  3. See my tutorial to Install Helm 3.4.2-catalina
  4. See my tutorial to Bring up Terminal

  5. Rather than downloading Istio from https://github.com/istio/istio/releases, install the istioctl CLI with Homebrew:

    brew install istioctl
  6. Verify installed version:

    istioctl version

    If the cluster is not running yet, or the Istio pods have not been set up, expect an error such as:

    unable to retrieve Pods: Get "https://127.0.0.1:32768/api/v1/namespaces/istio-system/pods?fieldSelector=status.phase%3DRunning&labelSelector=app%3Distiod": dial tcp 127.0.0.1:32768: connect: connection refused
    
  7. See my tutorial to Start minikube with Docker driver

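  8. Start Istio. These commands use Helm 2 syntax (helm init with Tiller, and the --name flag, all removed in Helm 3) and the chart paths of the Istio 1.0.x release, so run them with Helm 2 from an extracted Istio 1.0.x release directory: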

    kubectl apply -f install/kubernetes/helm/helm-service-account.yaml
    helm init --upgrade --service-account tiller
    helm install install/kubernetes/helm/istio --name istio \
      --namespace istio-system \
      --set gateways.istio-ingressgateway.type=NodePort \
      --set gateways.istio-egressgateway.type=NodePort \
      --set sidecarInjectorWebhook.enabled=false \
      --set global.mtls.enabled=false \
      --set tracing.enabled=true
    

Verifying Istio is meshing

VIDEO

To enable Istio by default for resources deployed into the environment, “label” the namespace that auto-injection should apply to:

  1. First, remove the hostname app. Automatic injection of the Istio proxy was disabled in the install above, which suits an environment where an application is being transitioned to Istio, or where multiple application environments exist and not all of them use a service mesh.

    kubectl delete -f hostname.yaml
    
  2. Upgrade the Istio release to include the auto-injection webhook:

    helm upgrade istio install/kubernetes/helm/istio \
      --namespace istio-system \
      --set gateways.istio-ingressgateway.type=NodePort \
      --set gateways.istio-egressgateway.type=NodePort \
      --set sidecarInjectorWebhook.enabled=true \
      --set global.mtls.enabled=false \
      --set tracing.enabled=true
    
  3. Label the namespace with the “istio-injection” key:

    kubectl label namespace default istio-injection=enabled
    
  4. Re-install the hostname.yaml app. The pod listing should now show that the sidecar was automatically injected:

    kubectl apply -f hostname.yaml
    kubectl get pods -l app=hostname
    
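  5. Use fortio, a command-line load-testing tool written in Go:

    ./fortio load -c 3 -n 20 -qps 0 http://hostname/version

    Here -c 3 uses three connections, -n 20 sends 20 requests in total, and -qps sets Queries Per Second (0 means no rate limit, so requests are sent as fast as possible).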

  6. List processing stats:

    ./fortio-faults
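To confirm the sidecar injection from step 4 programmatically rather than by eye, one option is to inspect the container names in the pod listing. The sketch below assumes the JSON document produced by kubectl get pods -l app=hostname -o json and relies on istio-proxy being the name Istio gives its injected sidecar container. The function name is hypothetical.

```python
import json

SIDECAR_NAME = "istio-proxy"  # container name Istio gives its injected sidecar

def pods_missing_sidecar(pod_list_json):
    """Return names of pods in a `kubectl get pods -o json` document lacking the sidecar."""
    doc = json.loads(pod_list_json)
    missing = []
    for pod in doc.get("items", []):
        names = [c["name"] for c in pod["spec"]["containers"]]
        if SIDECAR_NAME not in names:
            missing.append(pod["metadata"]["name"])
    return missing
```

An empty result means every listed pod carries the sidecar. Any names returned belong to pods created before the namespace was labeled, which need to be re-created to pick up injection.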

Sample Apps

Blueperf, the Public Cloud Environment Performance Analysis Application, contains a Java microservices application for the fictional airline company “Acme Air”. It is by Joe McClure at IBM and uses IBM Cloud Kubernetes Service (IKS).

  • acmeair-mainservice-java (GUI)
  • acmeair-authservice-java JWT (Set environment variable SECURE_SERVICE_CALLS = false to disable authentication.)
  • acmeair-bookingservice-java handles getting, making, and cancelling flight bookings.
  • acmeair-customerservice-java
  • acmeair-flightservice-java queries flights and reward miles.
  • acmeair-driver - a workload driver for the Acme Air Sample Application.

Isotope is a synthetic app with configurable topology.

istio-sample-app-1752x874 [1]

[1] “Kubernetes: Service Mesh with Istio”, released in 2018 by Robert Starmer (of Kumulus), is part of the “Master Cloud-Native Infrastructure with Kubernetes” Learning Path on LinkedIn. Jaeger Operator is covered.

The course uses Istio-1.0.2 and Helm within minikube on a macOS machine.

  • Adding Istio to a microservice
  • Traffic routing and deployment
  • Creating advanced route rules with Istio
  • Modifying routes for Canary deployments
  • Establishing MTLS credentials
  • Connecting to non-MTLS services
  • Connecting Istio to OpenTracing
  • Improving microservice robustness
  • Forcing aborts in specific applications - done by Istio recognizing cookie headers as triggers.

Rock Stars

Ray Tsang (@saturnism, saturnism.me), Google Cloud Platform Developer Advocate in NYC:

Kelsey Hightower:

References

BOOK: Istio Succinctly 2020

What is a service mesh? May 27, 2018 by Defog Tech

Microservices in the Cloud with Kubernetes and Istio (Google I/O ‘18) May 9, 2018 by Sandeep Dinesh

APIs, Microservices, and the Service Mesh (Cloud Next ‘19) by Dino Chiesa