Unquestionably, Kubernetes has quickly become the leading Container-as-a-Service (CaaS) platform. In late September 2017, Rancher Labs announced the release of Rancher 2.0, based on Kubernetes. In mid-October, at DockerCon Europe 2017, Docker announced they were integrating Kubernetes into the Docker platform. In late October, Microsoft released the public preview of Managed Kubernetes for Azure Container Service (AKS). In November, Google officially renamed its Google Container Engine to Google Kubernetes Engine. Most recently, at AWS re:Invent 2017, Amazon announced its own manged version of Kubernetes, Amazon Elastic Container Service for Kubernetes (Amazon EKS).
The recent abundance of Kuberentes-based CaaS offerings makes deploying, scaling, and managing modern distributed applications increasingly easier. However, as Craig McLuckie, CEO of Heptio, recently stated, “…it doesn’t matter who is delivering Kubernetes, what matters is how it runs.” Making Kubernetes run better is the goal of a new generation of tools, such as Istio, Envoy, Project Calico, Helm, and Ambassador.
What is Istio?
One of those new tools and the subject of this post is Istio. Released in Alpha by Google, IBM and Lyft, in May 2017, Istio is an open platform to connect, manage, and secure microservices. Istio describes itself as, “…an easy way to create a network of deployed services with load balancing, service-to-service authentication, monitoring, and more, without requiring any changes in service code. You add Istio support to services by deploying a special sidecar proxy throughout your environment that intercepts all network communication between microservices, configured and managed using Istio’s control plane functionality.”
Istio contains several components, split between the data plane and a control plane. The data plane includes the Istio Proxy (an extended version of Envoy proxy). The control plane includes the Istio Mixer, Istio Pilot, and Istio-Auth. The Istio components work together to provide behavioral insights and operational control over a microservice-based service mesh. Istio describes a service mesh as a “transparently injected layer of infrastructure between a service and the network that gives operators the controls they need while freeing developers from having to bake solutions to distributed system problems into their code.”
In this post, we will deploy the latest version of Istio, v0.4.0, on Google Cloud Platform, using the latest version of Google Kubernetes Engine (GKE), 1.8.4-gke.1. Both versions were just released in mid-December, as this post is being written. Google, as you probably know, was the creator of Kubernetes, now an open-source CNCF project. Google was the first Cloud Service Provider (CSP) to offer managed Kubernetes in the Cloud, starting in 2014, with Google Container Engine (GKE), which used Kubernetes. This post will outline the installation of Istio on GKE, as well as the deployment of a sample application, integrated with Istio, to demonstrate Istio’s observability features.
The scripts used in this post are as follows, in order of execution (gist).
Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.
Creating GKE Cluster
First, we create the Google Kubernetes Engine (GKE) cluster. The GKE cluster creation is highly-configurable from either the GCP Cloud Console or from the command line, using the Google Cloud Platform gcloud CLI. The CLI will be used throughout the post. I have chosen to create a highly-available, 3-node cluster (1 node/zone) in GCP’s South Carolina us-east1 region (gist).
Once built, we need to retrieve the cluster’s credentials.
The resulting GKE cluster will have the following characteristics (gist).
With the GKE cluster created, we can now deploy Istio. There are at least two options for deploying Istio on GCP. You may choose to manually install and configure Istio in a GKE cluster, as I will do in this post, following these instructions. Alternatively, you may choose to use the Istio GKE Deployment Manager. This all-in-one GCP service will create your GKE cluster, and install and configure Istio and the Istio add-ons, including their Book Info sample application.
There were a few reasons I chose not to use the Istio GKE Deployment Manager option. First, until very recently, you could not install the latest versions of Istio with this option (as of 12/21 you can now deploy v0.3.0 and v0.4.0). Secondly, currently, you only have the choice of GKE version 1.7.8-gke.0. I wanted to test the latest v1.8.4 release with a stable GA version of RBAC. Thirdly, at least three out of four of my initial attempts to use the Istio GKE Deployment Manager failed during provisioning for unknown reasons. Lastly, you will learn more about GKE, Kubernetes, and Istio by doing it yourself, at least the first time.
Istio Code Changes
Before installing Istio, I had to make several minor code changes to my existing Kubernetes resource files. The requirements are detailed in Istio’s Pod Spec Requirements. These changes are minor, but if missed, cause errors during deployment, which can be hard to identify and resolve.
First, you need to name your Service ports in your Service resource files. More specifically, you need to name your service ports,
http, as shown in the Candidate microservice’s Service resource file, below (note line 10) (gist).
Second, an app label is required for Istio. I added an app label to each Deployment and Service resource file, as shown below in the Candidate microservice’s Deployment resource files (note lines 5 and 6) (gist).
The next set of code changes were to my existing Ingress resource file. The requirements for an Ingress resource using Istio are explained here. The first change, Istio ignores all annotations other than
kubernetes.io/ingress.class: istio (note line 7, below). The next change, if using HTTPS, the secret containing your TLS/SSL certificate and private key must be called
istio-ingress-certs; all other names will be ignored (note line 10, below). Related and critically important, that secret must be deployed to the
istio-system namespace, not the application’s namespace. The last change, for my particular my prefix match routing rules, I needed to change the rules from
/.* is a special Istio notation that is used to indicate a prefix match (note lines 14, 18, and 22, below) (gist).
To install Istio, you first will need to download and uncompress the correct distribution of Istio for your OS. Istio provides instructions for installation on various platforms.
install-istio.sh script contains a variable,
ISTIO_HOME, which should point to the root of your local Istio directory. We will also deploy all the current Istio add-ons, including Prometheus, Grafana, Zipkin, Service Graph, and Zipkin-to-Stackdriver (gist).
Once installed, from the GCP Cloud Console, an alternative to the native Kubernetes Dashboard, we should see the following Istio resources deployed and running. Below, note the three nodes are distributed across three zones within the GCP us-east-1 region, the correct version of GKE is employed, Stackdriver logging and monitoring are enabled, and the Alpha Clusters features are also enabled.
And here, we see the nodes that comprise the GKE cluster.
Below, note the four components that comprise Istio:
istio-pilot. Additionally, note the five components that comprise the Istio add-ons.
Below, observe the Istio Ingress has automatically been assigned a public IP address by GCP, accessible on ports 80 and 443. This IP address is how we will communicate with applications running on our GKE cluster, behind the Istio Ingress Load Balancer. Later, we will see how the Istio Ingress Load Balancer knows how to route incoming traffic to those application endpoints, using the Voter API’s Ingress configuration.
Istio makes ample use of Kubernetes Config Maps and Secrets, to store configuration, and to store certificates for mutual TLS.
Creation of the GKE cluster and deployed Istio to the cluster is complete. Following, I will demonstrate the deployment of the Voter API to the cluster. This will be used to demonstrate the capabilities of Istio on GKE.
In addition to the GCP Cloud Console, the native Kubernetes Dashboard is also available. To open, use the
kubectl proxy command and connect to the Kubernetes Dashboard at
https://127.0.0.1:8001/ui. You should now be able to view and edit all resources, from within the Kubernetes Dashboard.
To demonstrate the functionality of Istio and GKE, I will deploy the Voter API. I have used variations of the sample Voter API application in several previous posts, including Architecting Cloud-Optimized Apps with AKS (Azure’s Managed Kubernetes), Azure Service Bus, and Cosmos DB and Eventual Consistency: Decoupling Microservices with Spring AMQP and RabbitMQ. I suggest reading these two post to better understand the Voter API’s design.
For this post, I have reconfigured the Voter API to use MongoDB’s Atlas Database-as-a-Service (DBaaS) as the NoSQL data-source for each microservice. The Voter API is connected to a MongoDB Atlas 3-node M10 instance cluster in GCP’s us-east1 (South Carolina) region. With Atlas, you have the choice of deploying clusters to GCP or AWS.
The Voter API will use CloudAMQP’s RabbitMQ-as-a-Service for its decoupled, eventually consistent, message-based architecture. For this post, the Voter API is configured to use a RabbitMQ cluster in GCP’s us-east1 (South Carolina) region; I chose a minimally-configured free version of RabbitMQ. CloudAMQP allows you to provide a much more robust multi-node clusters for Production, on GCP or AWS.
CloudAMQP provides access to their own Management UI, in addition to access to RabbitMQ’s Management UI.
With the Voter API running and taking traffic, we can see each Voter API microservice instance, nine replicas in total, connected to RabbitMQ. They are each publishing and consuming messages off the two queues.
The GKE, MongoDB Atlas, and RabbitMQ clusters are all running in the same GCP Region. Optimizing the Voter API cloud architecture on GCP, within a single Region, greatly reduces network latency, increases API performance, and improves end-to-end application and infrastructure observability and traceability.
Installing the Voter API
For simplicity, I have divided the Voter API deployment into three parts. First, we create the new
voter-api Kubernetes Namespace, followed by creating a series of Voter API Kuberentes Secrets (gist).
There are a total of five secrets, one secret for each of the three microservice’s MongoDB databases, one secret for the RabbitMQ connection string (shown below), and one secret containing a Let’s Encrypt SSL/TLS certificate chain and private key for the Voter API’s domain,
api.voter-demo.com (shown below).
Next, we create the microservice pods, using the Kubernetes Deployment files, create three ClusterIP-type Kubernetes Services, and a Kubernetes Ingress. The Ingress contains the service endpoint configuration, which Istio Ingress will use to correctly route incoming external API traffic (gist).
Three Kubernetes Pods for each of the three microservice should be created, for a total of nine pods. In the GCP Cloud UI’s Workloads (Kubernetes Deployments), you should see the following three resources. Note each Workload has three pods, each containing one replica of the microservice.
In the GCP Cloud UI’s Discovery and Load Balancing tab, you should observe the following four resources. Note the Voter API Ingress endpoints for the three microservices, which are used by the Istio Proxy, discussed below.
Examining the Voter API deployment more closely, you will observe that each of the nine Voter API microservice pods have two containers running within them (gist).
Along with the microservice container, there is an Istio Proxy container, commonly referred to as a sidecar container. Istio Proxy is an extended version of the Envoy proxy, Lyfts well-known, highly performant edge and service proxy. The proxy sidecar container is injected automatically when the Voter API pods are created. This is possible because we deployed the Istio Initializer (
istio-initializer.yaml). The Istio Initializer guarantees that Istio Proxy will be automatically injected into every microservice Pod. This is referred to as automatic sidecar injection. Below we see an example of one of three Candidate pods running the istio-proxy sidecar.
In the example above, all traffic to and from the Candidate microservice now passes through the Istio Proxy sidecar. With Istio Proxy, we gain several enterprise-grade features, including enhanced observability, service discovery and load balancing, credential injection, and connection management.
Manual Sidecar Injection
What if we have application components we do not want automatically managed with Istio Proxy. In that case, manual sidecar injection might be preferable to automatic sidecar injection with Istio Initializer. For manual sidecar injection, we execute a
istioctl kube-inject command for each of the Kubernetes Deployments. The command manually injects the Istio Proxy container configuration into the Deployment resource file, alongside each Voter API microservice container. On Mac and Linux, this command is similar to the following. Proxy injection is discussed in detail, here (gist).
External Service Egress
Whether you choose automatic or manual sidecar injection of the Istio Proxy, Istio’s egress rules currently only support HTTP and HTTPS requests. The Voter API makes external calls to its backend services, using two alternate protocols, MongoDB Wire Protocol (
mongodb://) and RabbitMQ AMQP (
amqps://). Since we cannot use an Istio egress rule for either protocol, we will use the
includeIPRanges option with the
istioctl kube-inject command to open egress to the two backend services. This will completely bypass Istio for a specific IP range. You can read more about calling external services directly, on Istio’s website.
You will need to modify the
includeIPRanges argument within the
create-voter-api-part3.sh script, adding your own GKE cluster’s IP ranges to the
IP_RANGES variable. The two IP ranges can be found using the following GCP CLI command (gist).
create-voter-api-part3.sh script also contains a modified version the
istioctl kube-inject command for each Voter API Deployment. Using the modified command, the original Deployment files are not altered, instead, a temporary copy of the Deployment file is created into which Istio injects the required modifications. The temporary Deployment file is then used for the deployment, and then immediately deleted (gist).
Some would argue not having the actual deployed version of the file checked into in source code control is an anti-pattern; in this case, I would disagree. If I need to redeploy, I would just run the
istioctl kube-inject command again. You can always view, edit, and import the deployed YAML file, from the GCP CLI or GKE Management UI.
The amount of Istio configuration injected into each microservice Pod’s Deployment resource file is considerable. The Candidate Deployment file swelled from 68 lines to 276 lines of code! This hints at the power, as well as the complexity of Istio. Shown below is a snippet of the Candidate Deployment YAML, after Istio injection.
Confirming Voter API
Installation of the Voter API is now complete. We can validate the Voter API is working, and that traffic is being routed through Istio, using Postman. Below, we see a list of candidates successfully returned from the Voter microservice, through the Voter API. This means, not only us the API running, but that messages have been successfully passed between the services, using RabbitMQ, and saved to the microservice’s corresponding MongoDB databases.
Below, note the
x-envoy-upstream-service-time response headers. They both confirm the Voter API HTTPS traffic is being managed by Istio.
Observability is certainly one of the primary advantages of implementing Istio. For anyone like myself, who has spent many long and often frustrating hours installing, configuring, and managing monitoring systems for distributed platforms, Istio’s observability features are most welcome. Istio provides Prometheus, Grafana, Zipkin, Service Graph, and Zipkin-to-Stackdriver add-ons. Combined with the monitoring capabilities of Backend-as-a-Service providers, such as MongoDB Altas and CloudAMQP RabvbitMQ, you get considerable visibility into your application, out-of-the-box.
First, we will look at Prometheus, a leading open-source monitoring solution. The easiest way to access the Prometheus UI, or any of the other add-ons, including Prometheus, is using port-forwarding. For example with Prometheus, we use the following command (gist).
Alternatively, you could securely expose any of the Istio add-ons through the Istio Ingress, similar to how the Voter API microservice endpoints are exposed.
Prometheus collects time series metrics from both the Istio and Voter API components. Below we see two examples of typical metrics being collected; they include the 201 responses generated by the Candidate microservice, and the outflow of bytes by the Election microservice, over a given period of time.
Although Prometheus is an excellent monitoring solution, Grafana, the leading open source software for time series analytics, provides a much easier way to visualize the metrics collected by Prometheus. Conveniently, Istio provides a dynamically-configured Grafana Dashboard, which will automatically display metrics for components deployed to GKE.
Below, note the metrics collected for the Candidate and Election microservice replicas. Out-of-the-box, Grafana displays common HTTP KPIs, such as request rate, success rate, response codes, response time, and response size. Based on the version label included in the Deployment resource files, we can delineate metrics collected by the version of the Voter API microservices, in this case, v1 of the Candidate and Election microservices.
Next, we have Zipkin, a leading distributed tracing system.
Since the Voter API application uses RabbitMQ to decouple communications between services, versus direct HTTP-based IPC, we won’t see any complex multi-segment traces. We will only see traces representing traffic to and from the microservices, which passes through the Istio Ingress.
Similar to Zipkin, Service Graph is not as valuable with the Voter API application as it could be with more complex applications. Below is a Service Graph view of the Voter API showing microservice version and requests/second to each microservice.
One last tool we have to monitor our GKE cluster is Stackdriver. Stackdriver provides fine-grain monitoring, logging, and diagnostics. If you recall, we enabled Stackdriver logging and monitoring when we first provisioned the GKE cluster. Stackdrive allows us to examine the performance of the GKE cluster’s resources, review logs, and set alerts.
When we installed Istio, we also installed the Zipkin-to-Stackdriver add-on. The Stackdriver Trace Zipkin Collector is a drop-in replacement for the standard Zipkin HTTP collector that writes to Google’s free Stackdriver Trace distributed tracing service. To use Stackdriver for traces originating from Zipkin, there is additional configuration required, which is commented out of the current version of the
zipkin-to-stackdriver.yaml file (gist).
Instructions to configure the Zipkin-to-Stackdriver feature can be found here. Below is an example of how you might add the necessary configuration using a Kubernetes ConfigMap to inject the required user credentials JSON file (
zipkin-to-stackdriver-creds.json) into the
zipkin-to-stackdriver container. The new configuration can be seen on lines 27-44 (gist).
Istio provides a significant amount of fine-grained management control to Kubernetes. Managed Kubernetes CaaS offerings like GKE, coupled with tools like Istio, will soon make running reliable and secure containerized applications in Production, commonplace.
- Introducing Istio: A robust service mesh for microservices
- IBM Code: What is a Service Mesh and how Istio fits in
- Microservices Patterns With Envoy Sidecar Proxy: The series
- OpenShift: Microservices Patterns with Envoy Sidecar Proxy, Part I: Circuit Breaking
- OpenShift: Microservices Patterns with Envoy Proxy, Part II: Timeouts and Retries
- OpenShift: Microservices Patterns With Envoy Proxy, Part III: Distributed Tracing
- Using Kubernetes Configmap with configuration files…
All opinions in this post are my own, and not necessarily the views of my current or past employers or their clients.