The sidecar proxy separates cross-cutting operational concerns from business logic, handing off to a control plane to do the rest.
“Service mesh” architecture gives microservices applications a standard way to hand off service-to-service access control, authentication, encrypted communications, monitoring, logging, timeout handling, load balancing, health checks, and other operational cross-cutting concerns to a sidecar proxy within each pod, which works with a “control plane” common to all services.
Decentralized microservices apps, Dockerized to run in containers, make all network communication through their sidecar proxy (like handing a box to UPS to deliver).
Each sidecar proxy can communicate with backends in different zones (generically named A, B, and C in the diagram). Policies sent to each sidecar can specify the zones and the share of traffic sent to each zone. Zones can be in different clouds (thus multi-cloud).
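As a rough illustration of per-zone traffic shares, here is a minimal Python sketch of weighted zone selection. The `ZONE_WEIGHTS` values and the function names are made up for illustration; a real sidecar receives such weights from its control plane and applies them in its own load balancer.

```python
import random
from collections import Counter

# Hypothetical policy as a control plane might push it:
# zone name -> relative share of traffic.
ZONE_WEIGHTS = {"A": 70, "B": 20, "C": 10}


def pick_zone(weights, rng):
    """Return one zone name, chosen with probability proportional to its weight."""
    zones = list(weights)
    return rng.choices(zones, weights=[weights[z] for z in zones], k=1)[0]


def simulate(weights, requests, seed=0):
    """Route `requests` simulated calls and count how many land in each zone."""
    rng = random.Random(seed)
    return Counter(pick_zone(weights, rng) for _ in range(requests))


if __name__ == "__main__":
    print(simulate(ZONE_WEIGHTS, 10_000))
```

Changing the weights (for example, setting one zone to 0) shifts traffic without touching application code, which is the point of pushing such policy to the proxy.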
This approach also enables security-related policies to be applied (such as limiting outflows from some countries) and detection of zonal failure and automatic rerouting around traffic anomalies (including DDoS attacks).
Each sidecar proxy and backend service report periodic state to the global load balancer (GLB) so it can make decisions that take into account latency, cost, load, current failures, etc. This enables centralized visualizations engineers use to understand and operate the entire distributed system in context.
This means app developers no longer need to bother coding for a long list of operational cross-cutting concerns:
- Collection and reporting of telemetry (health checks, logs, metrics, traces)
- TLS termination (certificate and key handling)
- Handling of HTTP/2, WebSocket, gRPC, and Redis protocols, as well as TCP traffic
- Rate limiting (DoS mitigation)
- Timeout and back-off handling when a response is not received
- Fault injection (for chaos engineering to improve reliability)
- Policy enforcement
- Load balancing
- Staged rollouts with percentage-based traffic splits
Embedding the above functionality in each app program may provide the best performance and scalability, but requires polyglot coding to implement the library in many languages. It can also be cumbersome to coordinate upgrades of new versions of each library across all services.
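To give a sense of what each in-app library would otherwise have to implement, here is a minimal Python sketch of one item from the list, rate limiting, using a token bucket. The class name, parameters, and injectable `clock` are illustrative assumptions, not the algorithm of any particular proxy.

```python
import time


class TokenBucket:
    """Allow about `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the logic can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True and spend one token if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would check `bucket.allow()` before forwarding each request and reject (for example, with HTTP 429) when it returns False. Multiply this by every concern in the list and every language in use, and the appeal of a shared sidecar becomes clear.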
Several sidecar programs have been created:
Logically, packets/requests travel through a “Data Plane”.
There is also a “Control Plane” which, rather than exchanging packets/requests, traffics in policies and configuration settings to enable services such as:
- deploy control (blue/green and/or traffic shifting),
- authentication and authorization settings,
- route table specification (e.g., when service A requests /foo what happens), and
- load balancer settings (e.g., timeouts, retries, circuit breakers, etc.).
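The kind of settings listed above can be pictured as plain data pushed to each sidecar. The following Python sketch is only an illustration: the `Route` and `SidecarConfig` types, the field names, and the cluster names are hypothetical and do not correspond to any product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Route:
    prefix: str             # request path prefix to match, e.g. "/foo"
    cluster: str            # hypothetical upstream name to forward to
    timeout_s: float = 1.0  # load balancer settings travel with the route
    max_retries: int = 0


@dataclass
class SidecarConfig:
    """Settings a control plane might push to one sidecar (illustrative only)."""
    routes: list = field(default_factory=list)

    def resolve(self, path):
        """Return the route with the longest matching prefix, or None."""
        matches = [r for r in self.routes if path.startswith(r.prefix)]
        return max(matches, key=lambda r: len(r.prefix), default=None)


# "When service A requests /foo, what happens": it goes to v2 with retries.
config = SidecarConfig(routes=[
    Route("/", "service-b-v1"),
    Route("/foo", "service-b-v2", timeout_s=0.5, max_retries=2),
])
```

Here `config.resolve("/foo/bar")` picks the `/foo` route, while any other path falls through to the `/` route. The data plane does the matching on every request; the control plane only replaces the data.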
The control plane aggregates telemetry data for display on dashboards such as the hero image above.
Individual apps interact with a proxy (Kubernetes sidecar) running alongside each service instance. The sidecars communicate with the control plane. This out-of-process architecture puts the hard stuff in one place and allows app developers to focus on business logic. A separate library in each language for operational concerns is not needed.
The “Control Plane” is a traffic controller that handles tracing, monitoring, logging, alerting, A/B testing, rolling deploys, canary deploys, rate limiting, and retry / circuit-breaker activities, including creation of new instances, based on application-wide policies for authentication and authorization.
The control plane includes an application programming interface, a command‑line interface, and a graphical user interface for managing the app.
Within a Service Mesh, apps create service instances from service definitions (templates). Thus, the term “service” refers to both the definitions and the instances themselves.
Several products provide a “control plane UI” (web portal/CLI) to set global system configuration settings and policies, as well as:
- Dynamic service discovery
- Certificate management (acting as a Certificate Authority (CA) and generating certificates to allow secure mTLS communication in the data plane)
- Automatic self-healing and zone failover (to maximize uptime)
Control Plane vendors
Several control plane vendors compete on features, configurability, extensibility, and usability:
Open-source Nelson uses Envoy as its proxy and builds a robust service mesh control plane around the HashiCorp stack (e.g., Nomad).
SmartStack creates a control plane using HAProxy or NGINX.
Cloud Foundry Spring Cloud?
https://istio.io/ aims to provide a uniform way to secure, connect, and monitor microservices. It provides rich automatic tracing, monitoring, and logging of all services in a “service mesh”, the network of microservices.
Istio provides APIs that let it integrate into any logging platform, or telemetry or policy system.
Istio makes it easy to create a network of deployed services with load balancing, service-to-service authentication, monitoring, and more.
“Without any changes in service code” applies only if the app has not implemented its own mechanism duplicative of Istio, like retry logic (which can bring a system down without attenuation mechanisms).
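To make the retry concern concrete, here is a minimal Python sketch of the kind of attenuation a retry loop needs: a bounded number of attempts with capped exponential back-off and jitter. The function name, defaults, and injectable `sleep` and `rng` parameters are assumptions for illustration and are not Istio's implementation.

```python
import random
import time


def call_with_backoff(op, max_attempts=4, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep, rng=random.random):
    """Call op(); on exception, wait with capped exponential back-off and retry.

    Re-raises the last exception once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Full jitter: wait a random time up to the capped exponential
            # delay, so many clients do not retry in lockstep.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
```

If an app keeps a loop like this while the mesh is also configured to retry, the two multiply: each app-level attempt can trigger several proxy-level attempts, which is how retries amplify load on a struggling service.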
gRPC is a high-performance, open-source universal RPC framework built on top of HTTP/2 to enable streaming between client and server.
It originated as project “stubby” within Google and is now a F/OSS project with open specs.
- Clients open one long-lived connection to a gRPC server
- A new HTTP/2 stream for each RPC call
gRPC avoids mistakes of SOAP WSDL:
- Protobuf vs. XML
- https://www.youtube.com/watch?v=RoXT_Rkg8LA by Twilio
- https://github.com/salesforce/reactive-grpc Lyft Envoy uses gRPC bridge to unlock Python gevent clients.
- https://www.youtube.com/watch?v=hNFM2pDGwKI Introduction to gRPC: A general RPC framework that puts mobile and HTTP/2 first (M.Atamel, R.Tsang)
Envoy (from Lyft)
Envoy provides robust APIs for dynamically managing its configuration.
Envoy is container-aware (Docker).
Envoy speaks HTTP/2 (H2) on both sides and supports gRPC.
Envoy does shadowing (forking traffic to a test cluster for live performance testing).
Envoy is written in C++11.
VIDEO: Lyft’s Envoy: From Monolith to Service Mesh, Feb 14, 2017, by Matt Klein (Lyft) explains, from a developer’s viewpoint, the reasons for SoA and its issues.
L7 reverse proxy at edge (replacement for NGINX).
Lyft uses LightStep for tracing, WaveFront for stats (via statsd).
- Twitter: @EnvoyProxy
NGINX built the equivalent of Istio Envoy.
It was built in the Rust programming language.
Linkerd provides Grafana dashboards and CLI debugging tools for Kubernetes service with no cluster-wide installation:
Its customers include Salesforce, Walmart, PayPal, Expedia, Comcast.
- https://linkerd.io/2/getting-started/ for installation, etc.
The circuit breaker pattern isolates unhealthy instances, then gradually brings them back into the healthy instance pool if warranted.
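A minimal Python sketch of that pattern follows. The class, thresholds, and injectable `clock` are illustrative assumptions; real proxies add refinements such as limiting how many trial requests pass while an instance is being re-admitted.

```python
import time


class CircuitBreaker:
    """Eject an instance after repeated failures, then probe it after a cool-down."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means closed: instance is in the healthy pool

    def allow(self):
        """Return True if a request may be sent to this instance."""
        if self.opened_at is None:
            return True
        # Half-open: once the cool-down has passed, let trial requests through.
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, success):
        """Report the outcome of a request sent to this instance."""
        if success:
            self.failures = 0
            self.opened_at = None  # trial succeeded: back in the pool
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # (re)open, restart the cool-down
```

A load balancer would call `allow()` before choosing the instance and `record()` after each response. A failed trial request reopens the breaker, so an instance that is still unhealthy is isolated again for another cool-down period.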
There is a quite thorough hands-on workshop using GKE (Google Kubernetes Engine).
https://github.com/retroryan/istio-workshop is the original, worked on by Ryan and others. It contains exercises.
In this workshop, you’ll learn how to install and configure Istio, an open source framework for connecting, securing, and managing microservices, on Google Kubernetes Engine, Google’s hosted Kubernetes product. You will also deploy an Istio-enabled multi-service application.
- https://github.com/jamesward/istio-workshop from Nov 2017 is a whole workshop with code.
Install prereqs and Istio
- See my tutorial to Install Git and other utilities
- See my tutorial to Install Kubernetes (minikube)
- See my tutorial to Install Helm 3.4.2-catalina
See my tutorial to Bring up Terminal
Rather than downloading Istio from https://github.com/istio/istio/releases, install istioctl with Homebrew:
brew install istioctl
Verify installed version:
If your pods have not already been set up, an error like this appears:
unable to retrieve Pods: Get "https://127.0.0.1:32768/api/v1/namespaces/istio-system/pods?fieldSelector=status.phase%3DRunning&labelSelector=app%3Distiod": dial tcp 127.0.0.1:32768: connect: connection refused
See my tutorial to Start minikube with Docker driver
kubectl apply -f install/kubernetes/helm/helm-service-account.yaml
helm init --upgrade --service-account tiller
helm install install/kubernetes/helm/istio --name istio \
  --namespace istio-system \
  --set gateways.istio-ingressgateway.type=NodePort \
  --set gateways.istio-egressgateway.type=NodePort \
  --set sidecarInjectorWebhook.enabled=false \
  --set global.mtls.enabled=false \
  --set tracing.enabled=true
Verifying Istio is meshing
To enable Istio by default for resources deployed into an environment, “label” the namespace into which the sidecar should be auto-injected.
First, clean up the hostname deployment. Automatic injection of the Istio proxy was previously disabled, which suits an environment where an application is being transitioned to Istio, or one where multiple application environments exist and not all of them use a service mesh.
kubectl delete -f hostname.yaml
Reset the Istio release to include the auto-injection webhook:
helm upgrade istio install/kubernetes/helm/istio \
  --namespace istio-system \
  --set gateways.istio-ingressgateway.type=NodePort \
  --set gateways.istio-egressgateway.type=NodePort \
  --set sidecarInjectorWebhook.enabled=true \
  --set global.mtls.enabled=false \
  --set tracing.enabled=true
Label the namespace with the “istio-injection” key:
kubectl label namespace default istio-injection=enabled
Re-install the hostname.yaml app and we should see that the sidecar is automatically injected:
kubectl apply -f hostname.yaml
kubectl get pods -l app=hostname
Use fortio, a command-line load testing tool written in Go:
./fortio load -c 3 -n 20 -qps 0 http://hostname/version
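qps = Queries Per Second (0 means no rate limit). -c sets the number of concurrent connections and -n the total number of calls.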
List processing stats:
Blueperf is the Public Cloud Environment Performance Analysis Application. It contains a (Java) microservices application for the fictional airline company “Acme Air”. By Joe McClure at IBM, using IBM Cloud Kubernetes Service (IKS).
- acmeair-mainservice-java (GUI)
- acmeair-authservice-java JWT (Set environment variable SECURE_SERVICE_CALLS = false to disable authentication.)
- acmeair-bookingservice-java handles getting, making, and cancelling flight bookings.
- acmeair-flightservice-java queries flights and reward miles.
- acmeair-driver - a workload driver for the Acme Air Sample Application.
Isotope is a synthetic app with configurable topology.
“Kubernetes: Service Mesh with Istio” released 2018 by Robert Starmer (of Kumulus) is part of the “Master Cloud-Native Infrastructure with Kubernetes” Learning Path on LinkedIn. Jaeger Operator is covered.
The course uses Istio-1.0.2 and Helm within minikube on an OSX machine.
- Adding Istio to a microservice
- Traffic routing and deployment
- Creating advanced route rules with Istio
- Modifying routes for Canary deployments
- Establishing MTLS credentials
- Connecting to non-MTLS services
- Connecting Istio to OpenTracing
- Improving microservice robustness
- Forcing aborts in specific applications - done by Istio recognizing cookie headers as triggers.
Ray Tsang (@saturnism, saturnism.me), Google Cloud Platform Developer Advocate in NYC:
- Making Microservices Micro with Istio Service Mesh Nov 10, 2017 at Devoxx
- Istio and Kubernetes (conversation)
What is a service mesh? May 27, 2018 by Defog Tech