Posts Tagged Kubernetes
In this two-part post, we are exploring the creation of a GKE cluster, replete with the latest version of Istio, often referred to as IoK (Istio on Kubernetes). We will then deploy, perform integration testing, and promote an application across multiple environments within the cluster.
In Part One of this post, we created a Kubernetes cluster on the Google Cloud Platform, installed Istio, provisioned a PostgreSQL database, and configured DNS for routing. Under the assumption that v1 of the Election microservice had already been released to Production, we deployed v1 to each of the three namespaces.
In Part Two of this post, we will learn how to utilize the advanced API testing capabilities of Postman and Newman to ensure v2 is ready for UAT and release to Production. We will deploy and perform integration testing of a new v2 of the Election microservice, locally on Kubernetes Minikube. Once confident v2 is functioning as intended, we will promote and test v2 across the
As a reminder, all source code for this post can be found on GitHub. The project’s README file contains a list of the Election microservice’s endpoints. To get started quickly, use one of the two following options (gist).
Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.
This project includes a kubernetes sub-directory, containing all the Kubernetes resource files and scripts necessary to recreate the example shown in the post.
Testing Locally with Minikube
Deploying to GKE, no matter how automated, takes time and resources, whether those resources are team members or just compute and system resources. Before deploying v2 of the Election service to the non-prod GKE cluster, we should ensure that it has been thoroughly tested locally. Local testing should include the following test criteria:
- Source code builds successfully
- All unit-tests pass
- A new Docker Image can be created from the build artifact
- The Service can be deployed to Kubernetes (Minikube)
- The deployed instance can connect to the database and execute the Liquibase changesets
- The deployed instance passes a minimal set of integration tests
Minikube gives us the ability to quickly iterate and test an application, as well as the Kubernetes and Istio resources required for its operation, before promoting to GKE. These resources include Kubernetes Namespaces, Secrets, Deployments, Services, Route Rules, and Istio Ingresses. Since Minikube is just that, a miniature version of our GKE cluster, we should be able to have a nearly one-to-one parity between the Kubernetes resources we apply locally and those applied to GKE. This post assumes you have the latest version of Minikube installed, and are familiar with its operation.
This project includes a minikube sub-directory, containing all the Kubernetes resource files and scripts necessary to recreate the Minikube deployment example shown in this post. The three included scripts are designed to be easily adapted to a CI/CD DevOps workflow. You may need to modify the scripts to match your environment’s configuration. Note this Minikube-deployed version of the Election service relies on the external Amazon RDS database instance.
Local Database Version
To eliminate the AWS costs, I have included a second, alternate version of the Minikube Kubernetes resource files, minikube_db_local This version deploys a single containerized PostgreSQL database instance to Minikube, as opposed to relying on the external Amazon RDS instance. Be aware, the database does not have persistent storage or an Istio sidecar proxy.
If you do not have a running Minikube cluster, create one with the
minikube start command.
Minikube allows you to use normal
kubectl CLI commands to interact with the Minikube cluster. Using the
kubectl get nodes command, we should see a single Minikube node running the latest Kubernetes v1.10.0.
Istio on Minikube
Next, install Istio following Istio’s online installation instructions. A basic Istio installation on Minikube, without the additional add-ons, should only require a single Istio install script.
If successful, you should observe a new
istio-system namespace, containing the four main Istio components:
Deploy v2 to Minikube
Next, create a Minikube Development environment, consisting of a
dev Namespace, Istio Ingress, and Secret, using the
part1-create-environment.sh script. Next, deploy v2 of the Election service to the
dev Namespace, along with an associated Route Rule, using the
part2-deploy-v2.sh script. One v2 instance should be sufficient to satisfy the testing requirements.
Access to v2 of the Election service on Minikube is a bit different than with GKE. When routing external HTTP requests, there is no load balancer, no external public IP address, and no public DNS or subdomains. To access the single instance of v2 running on Minikube, we use the local IP address of the Minikube cluster, obtained with the
minikube ip command. The access port required is the Node Port (
nodePort) of the
istio-ingress Service. The command is shown below (gist) and included in the
The second part of our HTTP request routing is the same as with GKE, relying on an Istio Route Rules. The
/v2/ sub-collection resource in the HTTP request URL is rewritten and routed to the v2 election Pod by the Route Rule. To confirm v2 of the Election service is running and addressable, curl the
/v2/actuator/health endpoint. Spring Actuator’s
/health endpoint is frequently used at the end of a CI/CD server’s deployment pipeline to confirm success. The Spring Boot application can take a few minutes to fully start up and be responsive to requests, depending on the speed of your local machine.
Using the Kubernetes Dashboard, we should see our deployment of the single Election service Pod is running successfully in Minikube’s
Once deployed, we run a battery of integration tests to confirm that the new v2 functionality is working as intended before deploying to GKE. In the next section of this post, we will explore the process creating and managing Postman Collections and Postman Environments, and how to automate those Collections of tests with Newman and Jenkins.
The typical reason an application is deployed to lower environments, prior to Production, is to perform application testing. Although definitions vary across organizations, testing commonly includes some or all of the following types: Integration Testing, Functional Testing, System Testing, Stress or Load Testing, Performance Testing, Security Testing, Usability Testing, Acceptance Testing, Regression Testing, Alpha and Beta Testing, and End-to-End Testing. Test teams may also refer to other testing forms, such as Whitebox (Glassbox), Blackbox Testing, Smoke, Validation, or Sanity Testing, and Happy Path Testing.
The site, softwaretestinghelp.com, defines integration testing as, ‘testing of all integrated modules to verify the combined functionality after integration is termed so. Modules are typically code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.’
In this post, we are concerned that our integrated modules are functioning cohesively, primarily the Election service, Amazon RDS database, DNS, Istio Ingress, Route Rules, and the Istio sidecar Proxy. Unlike Unit Testing and Static Code Analysis (SCA), which is done pre-deployment, integration testing requires an application to be deployed and running in an environment.
I have chosen Postman, along with Newman, to execute a Collection of integration tests before promoting to the next environment. The integration tests confirm the deployed application’s name and version. The integration tests then perform a series of HTTP GET, POST, PUT, PATCH, and DELETE actions against the service’s resources. The integration tests verify a successful HTTP response code is returned, based on the type of request made.
/candidates endpoint. We then use the stored candidate ID in proceeding HTTP GET, PUT, and PATCH test requests to the same
Environment-specific variables, such as the resource host, port, and environment sub-collection resource, are abstracted and stored as key/value pairs within Postman Environments, and called through variables in the request URL and within the tests. Thus, the same Postman Collection of tests may be run against multiple environments using different Postman Environments.
Postman Runner allows us to run multiple iterations of our Collection. We also have the option to build in delays between tests. Lastly, Postman Runner can load external JSON and CSV formatted test data, which is beyond the scope of this post.
Postman contains a simple Run Summary UI for viewing test results.
To support running tests from the command line, Postman provides Newman. According to Postman, Newman is a command-line collection runner for Postman. Newman offers the same functionality as Postman’s Collection Runner, all part of the
newman CLI. Newman is Node.js module, installed globally as an npm package,
npm install newman --global.
Typically, Development and Testing teams compose Postman Collections and define Postman Environments, locally. Teams run their tests locally in Postman, during their development cycle. Then, those same Postman Collections are executed from the command line, or more commonly as part of a CI/CD pipeline, such as with Jenkins.
Below, the same Collection of integration tests ran in the Postman Runner UI, are run from the command line, using Newman.
Without a doubt, Jenkins is the leading open-source CI/CD automation server. The building, testing, publishing, and deployment of microservices to Kubernetes is relatively easy with Jenkins. Generally, you would build, unit-test, push a new Docker image, and then deploy your application to Kubernetes using a series of CI/CD pipelines. Below, we see examples of these pipelines using Jenkins Blue Ocean, starting with a continuous integration pipeline, which includes unit-testing and Static Code Analysis (SCA) with SonarQube.
Followed by a pipeline to build the Docker Image, using the build artifact from the above pipeline, and pushes the Image to Docker Hub.
The third pipeline that demonstrates building the three Kubernetes environments and deploying v1 of the Election service to the dev namespace. This pipeline is just for demonstration purposes; typically, you would separate these functions.
An alternative to Jenkins for the deployment of microservices is Spinnaker, created by Netflix. According to Netflix, ‘Spinnaker is an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.’ Spinnaker is designed to integrate easily with Jenkins, dividing responsibilities for continuous integration and delivery, with deployment. Below, Spinnaker two sample deployment pipelines, similar to Jenkins, for deploying v1 and v2 of the Election service to the non-prod GKE cluster.
Below, Spinnaker has deployed v2 of the Election service to dev using a Highlander deployment strategy. Subsequently, Spinnaker has deployed v2 to test using a Red/Black deployment strategy, leaving the previously released v1 Server Group in place, in case a rollback is required.
Once Spinnaker is has completed the deployment tasks, the Postman Collections of smoke and integration tests are executed by Newman, as part of another Jenkins CI/CD pipeline.
In this pipeline, a set of basic smoke tests is run first to ensure the new deployment is running properly, and then the integration tests are executed.
Newman offers several options for displaying test results. For easy integration with Jenkins, Newman results can be delivered in a format that can be displayed as JUnit test reports. The JUnit test report format, XML, is a popular method of standardizing test results from different testing tools. Below is a truncated example of a test report file (gist).
Translating Newman test results to JUnit reports allows the percentage of test cases successfully executed, to be tracked over multiple deployments, a universal testing metric. Below we see the JUnit Test Reports Test Result Trend graph for a series of test runs.
Deploying to Development
Development environments typically have a rapid turnover of application versions. Many teams use their Development environment as a continuous integration environment, where every commit that successfully builds and passes all unit tests, is deployed. The purpose of the CI deployments is to ensure build artifacts will successfully deploy through the CI/CD pipeline, start properly, and pass a basic set of smoke tests.
Other teams use the Development environments as an extension of their local Minikube environment. The Development environment will possess some or all of the required external integration points, which the Developer’s local Minikube environment may not. The goal of the Development environment is to help Developers ensure their application is functioning correctly and is ready for the Test teams to evaluate, prior to promotion to the Test environment.
Some external integration points, such as external payment gateways, customer relationship management (CRM) systems, content management systems (CMS), or data analytics engines, are often stubbed-out in lower environments. Generally, third-party providers only offer a limited number of parallel non-Production integration environments. While an application may pass through several non-prod environments, testing against all external integration points will only occur in one or two of those environments.
With v2 of the Election service ready for testing on GKE, we deploy it to the GKE cluster’s dev namespace using the
part4a-deploy-v2-dev.sh script. We will also delete the previous v1 version of the Election service. Similar to the v1 deployment script, the v2 scripts perform a
kube-inject command, which manually injects the Istio sidecar proxy alongside the Election service, into each election v2 Pod. The deployment script also deploys an alternate Istio Route Rule, which routes requests to
api.dev.voter-demo.com/v2/* resource of v2 of the Election service.
Once deployed, we run our Postman Collection of integration tests with Newman or as part of a CI/CD pipeline. In the Development environment, we may choose to run a limited set of tests for the sake of expediency, or because not all external integration points are accessible.
Promotion to Test
With local Minikube and Development environment testing complete, we promote and deploy v2 of the Election service to the Test environment, using the
part4b-deploy-v2-test.sh script. In Test, we will not delete v1 of the Election service.
Often, an organization will maintain a running copy of all versions of an application currently deployed to Production, in a lower environment. Let’s look at two scenarios where this is common. First, v1 of the Election service has an issue in Production, which needs to be confirmed and may require a hot-fix by the Development team. Validation of the v1 Production bug is often done in a lower environment. The second scenario for having both versions running in an environment is when v1 and v2 both need to co-exist in Production. Organizations frequently support multiple API versions. Cutting over an entire API user-base to a new API version is often completed over a series of releases, and requires careful coordination with API consumers.
Testing All Versions
An essential role of integration testing should be to confirm that both versions of the Election service are functioning correctly, while simultaneously running in the same namespace. For example, we want to verify traffic is routed correctly, based on the HTTP request URL, to the correct version. Another common test scenario is database schema changes. Suppose we make what we believe are backward-compatible database changes to v2 of the Election service. We should be able to prove, through testing, that both the old and new versions function correctly against the latest version of the database schema.
There are different automation strategies that could be employed to test multiple versions of an application without creating separate Collections and Environments. A simple solution would be to templatize the Environments file, and then programmatically change the Postman Environment’s
version variable injected from a pipeline parameter (abridged environment file shown below).
Once initial automated integration testing is complete, Test teams will typically execute additional forms of application testing if necessary, before signing off for UAT and Performance Testing to begin.
With testing in the Test environments completed, we continue onto UAT. The term UAT suggest that a set of actual end-users (API consumers) of the Election service will perform their own testing. Frequently, UAT is only done for a short, fixed period of time, often with a specialized team of Testers. Issues experienced during UAT can be expensive and impact the ability to release an application to Production on-time if sign-off is delayed.
After deploying v2 of the Election service to UAT, and before opening it up to the UAT team, we would naturally want to repeat the same integration testing process we conducted in the previous Test environment. We must ensure that v2 is functioning as expected before our end-users begin their testing. This is where leveraging a tool like Jenkins makes automated integration testing more manageable and repeatable. One strategy would be to duplicate our existing Development and Test pipelines, and re-target the new pipeline to call v2 of the Election service in UAT.
Again, in a JUnit report format, we can examine individual results through the Jenkins Console.
We can also examine individual results from each test run using a specific build’s Console Output.
Testing and Instrumentation
To fully evaluate the integration test results, you must look beyond just the percentage of test cases executed successfully. It makes little sense to release a new version of an application if it passes all functional tests, but significantly increases client response times, unnecessarily increases memory consumption or wastes other compute resources, or is grossly inefficient in the number of calls it makes to the database or third-party dependencies. Often times, integration testing uncovers potential performance bottlenecks that are incorporated into performance test plans.
Critical intelligence about the performance of the application can only be obtained through the use of logging and metrics collection and instrumentation. Istio provides this telemetry out-of-the-box with Zipkin, Jaeger, Service Graph, Fluentd, Prometheus, and Grafana. In the included Grafana Istio Dashboard below, we see the performance of v1 of the Election service, under test, in the Test environment. We can compare request and response payload size and timing, as well as request and response times to external integration points, such as our Amazon RDS database. We are able to observe the impact of individual test requests on the application and all its integration points.
As part of integration testing, we should monitor the Amazon RDS CloudWatch metrics. CloudWatch allows us to evaluate critical database performance metrics, such as the number of concurrent database connections, CPU utilization, read and write IOPS, Memory consumption, and disk storage requirements.
A discussion of metrics starts moving us toward load and performance testing against Production service-level agreements (SLAs). Using a similar approach to integration testing, with load and performance testing, we should be able to accurately estimate the sizing requirements our new application for Production. Load and Performance Testing helps answer questions like the type and size of compute resources are required for our GKE Production cluster and for our Amazon RDS database, or how many compute nodes and number of instances (Pods) are necessary to support the expected user-load.
All opinions expressed in this post are my own, and not necessarily the views of my current or past employers, or their clients.
In the following two-part post, we will explore the creation of a GKE cluster, replete with the latest version of Istio, often referred to as IoK (Istio on Kubernetes). We will then deploy, perform integration testing, and promote an application across multiple environments within the cluster.
Application Environment Management
Container orchestration engines, such as Kubernetes, have revolutionized the deployment and management of microservice-based architectures. Combined with a Service Mesh, such as Istio, Kubernetes provides a secure, instrumented, enterprise-grade platform for modern, distributed applications.
One of many challenges with any platform, even one built on Kubernetes, is managing multiple application environments. Whether applications run on bare-metal, virtual machines, or within containers, deploying to and managing multiple application environments increases operational complexity.
As Agile software development practices continue to increase within organizations, the need for multiple, ephemeral, on-demand environments also grows. Traditional environments that were once only composed of Development, Test, and Production, have expanded in enterprises to include a dozen or more environments, to support the many stages of the modern software development lifecycle. Current application environments often include Continous Integration and Delivery (CI), Sandbox, Development, Integration Testing (QA), User Acceptance Testing (UAT), Staging, Performance, Production, Disaster Recovery (DR), and Hotfix. Each environment requiring its own compute, security, networking, configuration, and corresponding dependencies, such as databases and message queues.
Environments and Kubernetes
There are various infrastructure architectural patterns employed by Operations and DevOps teams to provide Kubernetes-based application environments to Development teams. One pattern consists of separate physical Kubernetes clusters. Separate clusters provide a high level of isolation. Isolation offers many advantages, including increased performance and security, the ability to tune each cluster’s compute resources to meet differing SLAs, and ensuring a reduced blast radius when things go terribly wrong. Conversely, separate clusters often result in increased infrastructure costs and operational overhead, and complex deployment strategies. This pattern is often seen in heavily regulated, compliance-driven organizations, where security, auditability, and separation of duties are paramount.
An alternative to separate physical Kubernetes clusters is virtual clusters. Virtual clusters are created using Kubernetes Namespaces. According to Kubernetes documentation, ‘Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces’.
In most enterprises, Operations and DevOps teams deliver a combination of both virtual and physical Kubernetes clusters. For example, lower environments, such as those used for Development, Test, and UAT, often reside on the same physical cluster, each in a separate virtual cluster (namespace). At the same time, environments such as Performance, Staging, Production, and DR, often require the level of isolation only achievable with physical Kubernetes clusters.
In the Cloud, physical clusters may be further isolated and secured using separate cloud accounts. For example, with AWS you might have a Non-Production AWS account and a Production AWS account, both managed by an AWS Organization.
In a multi-environment scenario, a single physical cluster would contain multiple namespaces, into which separate versions of an application or applications are independently deployed, accessed, and tested. Below we see a simple example of a single Kubernetes non-prod cluster on the left, containing multiple versions of different microservices, deployed across three namespaces. You would likely see this type of deployment pattern as applications are deployed, tested, and promoted across lower environments, before being released to Production.
To demonstrate the promotion and testing of an application across multiple environments, we will use a simple election-themed microservice, developed for a previous post, Developing Cloud-Native Data-Centric Spring Boot Applications for Pivotal Cloud Foundry. The Spring Boot-based application allows API consumers to create, read, update, and delete, candidates, elections, and votes, through an exposed set of resources, accessed via RESTful endpoints.
All source code for this post can be found on GitHub. The project’s README file contains a list of the Election microservice’s endpoints. To get started quickly, use one of the two following options (gist).
Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.
This project includes a kubernetes sub-directory, containing all the Kubernetes resource files and scripts necessary to recreate the example shown in the post. The scripts are designed to be easily adapted to a CI/CD DevOps workflow. You will need to modify the script’s variables to match your own environment’s configuration.
The post’s Spring Boot application relies on a PostgreSQL database. In the previous post, ElephantSQL was used to host the PostgreSQL instance. This time, I have used Amazon RDS for PostgreSQL. Amazon RDS for PostgreSQL and ElephantSQL are equivalent choices. For simplicity, you might also consider a containerized version of PostgreSQL, managed as part of your Kubernetes environment.
Ideally, each environment should have a separate database instance. Separate database instances provide better isolation, fine-grained RBAC, easier test data lifecycle management, and improved performance. Although, for this post, I suggest a single, shared, minimally-sized RDS instance.
The PostgreSQL database’s sensitive connection information, including database URL, username, and password, are stored as Kubernetes Secrets, one secret for each namespace, and accessed by the Kubernetes Deployment controllers.
Although not required, Istio makes the task of managing multiple virtual and physical clusters significantly easier. Following Istio’s online installation instructions, download and install Istio 0.7.1.
To create a Google Kubernetes Engine (GKE) cluster with Istio, you could use
container clusters create command, followed by installing Istio manually using Istio’s supplied Kubernetes resource files. This was the method used in the previous post, Deploying and Configuring Istio on Google Kubernetes Engine (GKE).
Alternatively, you could use Istio’s Google Cloud Platform (GCP) Deployment Manager files, along with the
deployment-manager deployments create command to create a Kubernetes cluster, replete with Istio, in a single step. Although arguably simpler, the
deployment-manager method does not provide the same level of fine-grain control over cluster configuration as the container clusters create method. For this post, the
deployment-manager method will suffice.
The latest version of the Google Kubernetes Engine, available at the time of this post, is 1.9.6-gke.0. However, to install this version of Kubernetes Engine using the Istio’s supplied deployment Manager Jinja template requires updating the hardcoded value in the
istio-cluster.jinja file from 1.9.2-gke.1. This has been updated in the next release of Istio.
Another change, the latest version of Istio offered as an option in the istio-cluster-jinja.schema file. Specifically, the
installIstioRelease configuration variable is only 0.6.0. The template does not include 0.7.1 as an option. Modify the
istio-cluster-jinja.schema file to include the choice of 0.7.1. Optionally, I also set 0.7.1 as the default. This change should also be included in the next version of Istio.
There are a limited number of GKE and Istio configuration defaults defined in the
istio-cluster.yaml file, all of which can be overridden from the command line.
To optimize the cluster, and keep compute costs to a minimum, I have overridden several of the default configuration values using the properties flag with the gcloud CLI’s
deployment-manager deployments create command. The README file provided by Istio explains how to use this feature. Configuration changes include the name of the cluster, the version of Istio (0.7.1), the number of nodes (2), the GCP zone (us-east1-b), and the node instance type (n1-standard-1). I also disabled automatic sidecar injection and chose not to install the Istio sample book application onto the cluster (gist).
To provision the GKE cluster and deploy Istio, first modify the variables in the
part1-create-gke-cluster.sh file (shown above), then execute the script. The script also retrieves your cluster’s credentials, to enable command line interaction with the cluster using the
Once complete, validate the version of Istio by examining Istio’s Docker image versions, using the following command (gist).
The result should be a list of Istio 0.7.1 Docker images.
The new cluster should be running GKE version 1.9.6.gke.0. This can be confirmed using the following command (gist).
Or, from the GCP Cloud Console.
The new GKE cluster should be composed of (2) n1-standard-1 nodes, running in the us-east-1b zone.
As part of the deployment, all of the separate Istio components should be running within the
As part of the deployment, an external IP address and a load balancer were provisioned by GCP and associated with the Istio Ingress. GCP’s Deployment Manager should have also created the necessary firewall rules for cluster ingress and egress.
Building the Environments
Next, we will create three namespaces,
uat, which represent three non-production environments. Each environment consists of a Kubernetes Namespace, Istio Ingress, and Secret. The three environments are deployed using the
Deploying Election v1
For this demonstration, we will assume v1 of the Election service has been previously promoted, tested, and released to Production. Hence, we would expect v1 to be deployed to each of the lower environments. Additionally, a new v2 of the Election service has been developed and tested locally using Minikube. It is ready for deployment to the three environments and will undergo integration testing (detailed in Part Two of the post).
If you recall from our GKE/Istio configuration, we chose manual sidecar injection of the Istio proxy. Therefore, all election deployment scripts perform a
kube-inject command. To connect to our external Amazon RDS database, this
kube-inject command requires the
includeIPRanges flag, which contains two cluster configuration values, the cluster’s IPv4 CIDR (
clusterIpv4Cidr) and the service’s IPv4 CIDR (
Before deployment, we export the
includeIPRanges value as an environment variable, which will be used by the deployment scripts, using the following command,
export IP_RANGES=$(sh ./get-cluster-ip-ranges.sh). The
get-cluster-ip-ranges.sh script is shown below (gist).
Using this method with manual sidecar injection is discussed in the previous post, Deploying and Configuring Istio on Google Kubernetes Engine (GKE).
To deploy v1 of the Election service to all three namespaces, execute the
We should now have two instances of v1 of the Election service, running in the
uat namespaces, for a total of six election-v1 Kubernetes Pods.
HTTP Request Routing
Before deploying additional versions of the Election service in Part Two of this post, we should understand how external HTTP requests will be routed to different versions of the Election service, in multiple namespaces. In the post’s simple example, we have a matrix of three namespaces and two versions of the Election service. That means we need a method to route external traffic to up to six different election versions. There multiple ways to solve this problem, each with their own pros and cons. For this post, I found a combination of DNS and HTTP request rewriting is most effective.
First, to route external HTTP requests to the correct namespace, we will use subdomains. Using my current DNS management solution, Azure DNS, I create three new A records for my registered domain,
voter-demo.com. There is one A record for each namespace, including
All three subdomains should resolve to the single external IP address assigned to the cluster’s load balancer.
istio-ingress service load balancer, running in the
istio-system namespace, routes inbound external traffic, based on the Request URL, to the Istio Ingress in the appropriate namespace.
The Istio Ingress in the namespace then directs the traffic to one of the Kubernetes Pods, containing the Election service and the Istio sidecar proxy.
To direct the HTTP request to v1 or v2 of the Election service, an Istio Route Rule is used. As part of the environment creation, along with a Namespace and Ingress resources, we also deployed an Istio Route Rule to each environment. This particular route rule examines the HTTP request URL for a
/v2/ sub-collection resource. If it finds the sub-collection resource, it performs a HTTPRewrite, removing the sub-collection resource from the HTTP request. The Route Rule then directs the HTTP request to the appropriate version of the Election service, v1 or v2 (gist).
According to Istio, ‘if there are multiple registered instances with the specified tag(s), they will be routed to based on the load balancing policy (algorithm) configured for the service (round-robin by default).’ We are using the default load balancing algorithm to distribute requests across multiple copies of each Election service.
The final external HTTP request routing for the Election service in the Non-Production GKE cluster is shown on the left, in the diagram, below. Every Election service Pod also contains an Istio sidecar proxy instance.
Below are some examples of HTTP GET requests that would be successfully routed to our Election service, using the above-described routing strategy (gist).
In Part One of this post, we created the Kubernetes cluster on the Google Cloud Platform, installed Istio, provisioned a PostgreSQL database, and configured DNS for routing. Under the assumption that v1 of the Election microservice had already been released to Production, we deployed v1 to each of the three namespaces.
In Part Two of this post, we will learn how to utilize the sophisticated API testing capabilities of Postman and Newman to ensure v2 is ready for UAT and release to Production. We will deploy and perform integration testing of a new, v2 of the Election microservice, locally, on Kubernetes Minikube. Once we are confident v2 is functioning as intended, we will promote and test v2, across the
All opinions expressed in this post are my own, and not necessarily the views of my current or past employers, or their clients.
Architecting Cloud-Optimized Apps with AKS (Azure’s Managed Kubernetes), Azure Service Bus, and Cosmos DB
An earlier post, Eventual Consistency: Decoupling Microservices with Spring AMQP and RabbitMQ, demonstrated the use of a message-based, event-driven, decoupled architectural approach for communications between microservices, using Spring AMQP and RabbitMQ. This distributed computing model is known as eventual consistency. To paraphrase microservices.io, ‘using an event-driven, eventually consistent approach, each service publishes an event whenever it updates its data. Other services subscribe to events. When an event is received, a service (subscriber) updates its data.’
That earlier post illustrated a fairly simple example application, the Voter API, consisting of a set of three Spring Boot microservices backed by MongoDB and RabbitMQ, and fronted by an API Gateway built with HAProxy. All API components were containerized using Docker and designed for use with Docker CE for AWS as the Container-as-a-Service (CaaS) platform.
Optimizing for Kubernetes on Azure
This post will demonstrate how a modern application, such as the Voter API, is optimized for Kubernetes in the Cloud (Kubernetes-as-a-Service), in this case, AKS, Azure’s new public preview of Managed Kubernetes for Azure Container Service. According to Microsoft, the goal of AKS is to simplify the deployment, management, and operations of Kubernetes. I wrote about AKS in detail, in my last post, First Impressions of AKS, Azure’s New Managed Kubernetes Container Service.
In addition to migrating to AKS, the Voter API will take advantage of additional enterprise-grade Azure’s resources, including Azure’s Service Bus and Cosmos DB, replacements for the Voter API’s RabbitMQ and MongoDB. There are several architectural options for the Voter API’s messaging and NoSQL data source requirements when moving to Azure.
- Keep Dockerized RabbitMQ and MongoDB – Easy to deploy to Kubernetes, but not easily scalable, highly-available, or manageable. Would require storage optimized Azure VMs for nodes, node affinity, and persistent storage for data.
- Replace with Cloud-based Non-Azure Equivalents – Use SaaS-based equivalents, such as CloudAMQP (RabbitMQ-as-a-Service) and MongoDB Atlas, which will provide scalability, high-availability, and manageability.
- Replace with Azure Service Bus and Cosmos DB – Provides all the advantages of SaaS-based equivalents, and additionally as Azure resources, benefits from being in the Azure Cloud alongside AKS.
The Kubernetes resource files and deployment scripts used in this post are all available on GitHub. This is the only project you need to clone to reproduce the AKS example in this post.
git clone \ --branch master --single-branch --depth 1 --no-tags \ https://github.com/garystafford/azure-aks-sb-cosmosdb-demo.git
The Docker images for the three Spring microservices deployed to AKS, Voter, Candidate, and Election, are available on Docker Hub. Optionally, the source code, including Dockerfiles, for the Voter, Candidate, and Election microservices, as well as the Voter Client are available on GitHub, in the
git clone \ --branch kub-aks --single-branch --depth 1 --no-tags \ https://github.com/garystafford/candidate-service.git git clone \ --branch kub-aks --single-branch --depth 1 --no-tags \ https://github.com/garystafford/election-service.git git clone \ --branch kub-aks --single-branch --depth 1 --no-tags \ https://github.com/garystafford/voter-service.git git clone \ --branch kub-aks --single-branch --depth 1 --no-tags \ https://github.com/garystafford/voter-client.git
Azure Service Bus
To demonstrate the capabilities of Azure’s Service Bus, the Voter API’s Spring microservice’s source code has been re-written to work with Azure Service Bus instead of RabbitMQ. A future post will explore the microservice’s messaging code. It is more likely that a large application, written specifically for a technology that is easily portable such as RabbitMQ or MongoDB, would likely remain on that technology, even if the application was lifted and shifted to the Cloud or moved between Cloud Service Providers (CSPs). Something important to keep in mind when choosing modern technologies – portability.
Service Bus is Azure’s reliable cloud Messaging-as-a-Service (MaaS). Service Bus is an original Azure resource offering, available for several years. The core components of the Service Bus messaging infrastructure are queues, topics, and subscriptions. According to Microsoft, ‘the primary difference is that topics support publish/subscribe capabilities that can be used for sophisticated content-based routing and delivery logic, including sending to multiple recipients.’
Since the three Voter API’s microservices are not required to produce messages for more than one other service consumer, Service Bus queues are sufficient, as opposed to a pub/sub model using Service Bus topics.
Cosmos DB, Microsoft’s globally distributed, multi-model database, offers throughput, latency, availability, and consistency guarantees with comprehensive service level agreements (SLAs). Ideal for the Voter API, Cosmos DB supports MongoDB’s data models through the MongoDB API, a MongoDB database service built on top of Cosmos DB. The MongoDB API is compatible with existing MongoDB libraries, drivers, tools, and applications. Therefore, there are no code changes required to convert the Voter API from MongoDB to Cosmos DB. I simply had to change the database connection string.
NGINX Ingress Controller
Although the Voter API’s HAProxy-based API Gateway could be deployed to AKS, it is not optimal for Kubernetes. Instead, the Voter API will use an NGINX-based Ingress Controller. NGINX will serve as an API Gateway, as HAProxy did, previously.
According to NGINX, ‘an Ingress is a Kubernetes resource that lets you configure an HTTP load balancer for your Kubernetes services. Such a load balancer usually exposes your services to clients outside of your Kubernetes cluster.’
An Ingress resource requires an Ingress Controller to function. Continuing from NGINX, ‘an Ingress Controller is an application that monitors Ingress resources via the Kubernetes API and updates the configuration of a load balancer in case of any changes. Different load balancers require different Ingress controller implementations. In the case of software load balancers, such as NGINX, an Ingress controller is deployed in a pod along with a load balancer.’
There are currently two NGINX-based Ingress Controllers available, one from Kubernetes and one directly from NGINX. Both being equal, for this post, I chose the Kubernetes version, without RBAC (Kubernetes offers a version with and without RBAC). RBAC should always be used for actual cluster security. There are several advantages of using either version of the NGINX Ingress Controller for Kubernetes, including Layer 4 TCP and UDP and Layer 7 HTTP load balancing, reverse proxying, ease of SSL termination, dynamically-configurable path-based rules, and support for multiple hostnames.
Azure Web App
Lastly, the Voter Client application, not really part of the Voter API, but useful for demonstration purposes, will be converted from a containerized application to an Azure Web App. Since it is not part of the Voter API, separating the Client application from AKS makes better architectural sense. Web Apps are a powerful, richly-featured, yet incredibly simple way to host applications and services on Azure. For more information on using Azure Web Apps, read my recent post, Developing Applications for the Cloud with Azure App Services and MongoDB Atlas.
Revised Component Architecture
Below is a simplified component diagram of the new architecture, including Azure Service Bus, Cosmos DB, and the NGINX Ingress Controller. The new architecture looks similar to the previous architecture, but as you will see, it is actually very different.
To understand the role of each API component, let’s look at one of the event-driven, decoupled process flows, the creation of a new election candidate. In the simplified flow diagram below, an API consumer executes an HTTP POST request containing the new candidate object as JSON. The Candidate microservice receives the HTTP request and creates a new document in the Cosmos DB Voter database. A Spring
RepositoryEventHandler within the Candidate microservice responds to the document creation and publishes a Create Candidate event message, containing the new candidate object as JSON, to the Azure Service Bus Candidate Queue.
Independently, the Voter microservice is listening to the Candidate Queue. Whenever a new message is produced by the Candidate microservice, the Voter microservice retrieves the message off the queue. The Voter microservice then transforms the new candidate object contained in the incoming message to its own candidate data model and creates a new document in its own Voter database.
The same process flows exist between the Election and the Candidate microservices. The Candidate microservice maintains current elections in its database, which are retrieved from the Election queue.
It is useful to understand, the Candidate microservice’s candidate domain model is not necessarily identical to the Voter microservice’s candidate domain model. Each microservice may choose to maintain its own representation of a vote, a candidate, and an election. The Voter service transforms the new candidate object in the incoming message based on its own needs. In this case, the Voter microservice is only interested in a subset of the total fields in the Candidate microservice’s model. This is the beauty of decoupling microservices, their domain models, and their datastores.
The versions of the Voter API microservices used for this post only support Election Created events and Candidate Created events. They do not handle Delete or Update events, which would be necessary to be fully functional. For example, if a candidate withdraws from an election, the Voter service would need to be notified so no one places votes for that candidate. This would normally happen through a Candidate Delete or Candidate Update event.
Provisioning Azure Service Bus
First, the Azure Service Bus is provisioned. Provisioning the Service Bus may be accomplished using several different methods, including manually using the Azure Portal or programmatically using Azure Resource Manager (ARM) with PowerShell or Terraform. I chose to provision the Azure Service Bus and the two queues using the Azure Portal for expediency. I chose the Basic Service Bus Tier of service, of which there are three tiers, Basic, Standard, and Premium.
The application requires two queues, the
candidate.queue, and the
Provisioning Cosmos DB
Next, Cosmos DB is provisioned. Like Azure Service Bus, Cosmos DB may be provisioned using several methods, including manually using the Azure Portal, programmatically using Azure Resource Manager (ARM) with PowerShell or Terraform, or using the Azure CLI, which was my choice.
az cosmosdb create \ --name cosmosdb_instance_name_goes_here \ --resource-group resource_group_name_goes_here \ --location "East US=0" \ --kind MongoDB
The post’s Cosmos DB instance exists within the single East US Region, with no failover. In a real Production environment, you would configure Cosmos DB with multi-region failover. I chose MongoDB as the type of Cosmos DB database account to create. The allowed values are GlobalDocumentDB, MongoDB, Parse. All other settings were left to the default values.
The three Spring microservices each have their own database. You do not have to create the databases in advance of consuming the Voter API. The databases and the database collections will be automatically created when new documents are first inserted by the microservices. Below, the three databases and their collections have been created and populated with documents.
The GitHub project repository also contains three shell scripts to generate sample vote, candidate, and election documents. The scripts will delete any previous documents from the database collections and generate new sets of sample documents. To use, you will have to update the scripts with your own Voter API URL.
MongoDB Aggregation Pipeline
Each of the three Spring microservices uses Spring Data MongoDB, which takes advantage of MongoDB’s Aggregation Framework. According to MongoDB, ‘the aggregation framework is modeled on the concept of data processing pipelines. Documents enter a multi-stage pipeline that transforms the documents into an aggregated result.’ Below is an example of aggregation from the Candidate microservice’s
Aggregation aggregation = Aggregation.newAggregation( match(Criteria.where("election").is(election)), group("candidate").count().as("votes"), project("votes").and("candidate").previousOperation(), sort(Sort.Direction.DESC, "votes") );
To use MongoDB’s aggregation framework with Cosmos DB, it is currently necessary to activate the MongoDB Aggregation Pipeline Preview Feature of Cosmos DB. The feature can be activated from the Azure Portal, as shown below.
Cosmos DB Emulator
Be warned, Cosmos DB can be very expensive, even without database traffic or any Production-grade bells and whistles. Be careful when spinning up instances on Azure for learning purposes, the cost adds up quickly! In less than ten days, while writing this post, my cost was almost US$100 for the Voter API’s Cosmos DB instance.
I strongly recommend downloading the free Azure Cosmos DB Emulator to develop and test applications from your local machine. Although certainly not as convenient, it will save you the cost of developing for Cosmos DB directly on Azure.
With Cosmos DB, you pay for reserved throughput provisioned and data stored in containers (a collection of documents or a table or a graph). Yes, that’s right, Azure charges you per MongoDB collection, not even per database. Azure Cosmos DB’s current pricing model seems less than ideal for microservice architectures, each with their own database instance.
By default the reserved throughput, billed as Request Units (RU) per second or RU/s, is set to 1,000 RU/s per collection. For development and testing, you can reduce each collection to a minimum of 400 RU/s. The Voter API creates five collections at 1,000 RU/s or 5,000 RU/s total. Reducing this to a total of 2,000 RU/s makes Cosmos DB marginally more affordable to explore.
Building the AKS Cluster
An existing Azure Resource Group is required for AKS. I chose to use the latest available version of Kubernetes, 1.8.2.
# login to azure az login \ --username your_username \ --password your_password # create resource group az group create \ --resource-group resource_group_name_goes_here \ --location eastus # create aks cluster az aks create \ --name cluser_name_goes_here \ --resource-group resource_group_name_goes_here \ --ssh-key-value path_to_your_public_key \ --kubernetes-version 1.8.2 # get credentials to access aks cluster az aks get-credentials \ --name cluser_name_goes_here \ --resource-group resource_group_name_goes_here # display cluster's worker nodes kubectl get nodes --output=wide
By default, AKS will provision a three-node Kubernetes cluster using Azure’s Standard D1 v2 Virtual Machines. According to Microsoft, ‘D series VMs are general purpose VM sizes provide balanced CPU-to-memory ratio. They are ideal for testing and development, small to medium databases, and low to medium traffic web servers.’ Azure D1 v2 VM’s are based on Linux OS images, currently Debian 8 (Jessie), with 1 vCPU and 3.5 GB of memory. By default with AKS, each VM receives 30 GiB of Standard HDD attached storage.
You should always select the type and quantity of the cluster’s VMs and their attached storage, optimized for estimated traffic volumes and the specific workloads you are running. This can be done using the
--node-osdisk-size arguments with the
az aks create command.
The Voter API resources are deployed to its own Kubernetes Namespace,
voter-api. The NGINX Ingress Controller resources are deployed to a different namespace,
ingress-nginx. Separate namespaces help organize individual Kubernetes resources and separate different concerns.
voter-api namespace is created. Then, five required Kubernetes Secrets are created within the namespace. These secrets all contain sensitive information, such as passwords, that should not be shared. There is one secret for each of the three Cosmos DB database connection strings, one secret for the Azure Service Bus connection string, and one secret for the Let’s Encrypt SSL/TLS certificate and private key, used for secure HTTPS access to the Voter API.
The Voter API’s secrets are used to populate environment variables within the pod’s containers. The environment variables are then available for use within the containers. Below is a snippet of the Voter pods resource file showing how the Cosmos DB and Service Bus connection strings secrets are used to populate environment variables.
env: - name: AZURE_SERVICE_BUS_CONNECTION_STRING valueFrom: secretKeyRef: name: azure-service-bus key: connection-string - name: SPRING_DATA_MONGODB_URI valueFrom: secretKeyRef: name: azure-cosmosdb-voter key: connection-string
Shown below, the Cosmos DB and Service Bus connection strings secrets have been injected into the Voter container and made available as environment variables to the microservice’s executable JAR file on start-up. As environment variables, the secrets are visible in plain text. Access to containers should be tightly controlled through Kubernetes RBAC and Azure AD, to ensure sensitive information, such as secrets, remain confidential.
Next, the three Kubernetes ReplicaSet resources, corresponding to the three Spring microservices, are created using Deployment controllers. According to Kubernetes, a Deployment that configures a ReplicaSet is now the recommended way to set up replication. The Deployments specify three replicas of each of the three Spring Services, resulting in a total of nine Kubernetes Pods.
Each pod, by default, will be scheduled on a different node if possible. According to Kubernetes, ‘the scheduler will automatically do a reasonable placement (e.g. spread your pods across nodes, not place the pod on a node with insufficient free resources, etc.).’ Note below how each of the three microservice’s three replicas has been scheduled on a different node in the three-node AKS cluster.
Next, the three corresponding Kubernetes ClusterIP-type Services are created. And lastly, the Kubernetes Ingress is created. According to Kubernetes, the Ingress resource is an API object that manages external access to the services in a cluster, typically HTTP. Ingress provides load balancing, SSL termination, and name-based virtual hosting.
The Ingress configuration contains the routing rules used with the NGINX Ingress Controller. Shown below are the routing rules for each of the three microservices within the Voter API. Incoming API requests are routed to the appropriate pod and service port by NGINX.
apiVersion: extensions/v1beta1 kind: Ingress metadata: name: voter-ingress namespace: voter-api annotations: ingress.kubernetes.io/ssl-redirect: "true" spec: tls: - hosts: - api.voter-demo.com secretName: api-voter-demo-secret rules: - http: paths: - path: /candidate backend: serviceName: candidate servicePort: 8080 - path: /election backend: serviceName: election servicePort: 8080 - path: /voter backend: serviceName: voter servicePort: 8080
The screengrab below shows all of the Voter API resources created on AKS.
NGINX Ingress Controller
After completing the deployment of the Voter API, the NGINX Ingress Controller is created. It starts with creating the
ingress-nginx namespace. Next, the NGINX Ingress Controller is created, consisting of the NGINX Ingress Controller, three Kubernetes ConfigMap resources, and a default back-end application. The Controller and backend each have their own Service resources. Like the Voter API, each has three replicas, for a total of six pods. Together, the Ingress resource and NGINX Ingress Controller manage traffic to the Spring microservices.
The screengrab below shows all of the NGINX Ingress Controller resources created on AKS.
The NGINX Ingress Controller Service, shown above, has an external public IP address associated with itself. This is because that Service is of the type, Load Balancer. External requests to the Voter API will be routed through the NGINX Ingress Controller, on this IP address.
kind: Service apiVersion: v1 metadata: name: ingress-nginx namespace: ingress-nginx labels: app: ingress-nginx spec: externalTrafficPolicy: Local type: LoadBalancer selector: app: ingress-nginx ports: - name: http port: 80 targetPort: http - name: https port: 443 targetPort: https
If you are only using HTTPS, not HTTP, then the references to HTTP and port 80 in the Ingress configuration are unnecessary. The NGINX Ingress Controller’s resources are explained in detail in the GitHub documentation, along with further configuration instructions.
To provide convenient access to the Voter API and the Voter Client, my domain,
voter-demo.com, is associated with the public IP address associated with the Voter API Ingress Controller and with the public IP address associated with the Voter Client Azure Web App. DNS configuration is done through Azure’s DNS Zone resource.
TXT type records might not look as familiar as the
A type records. The
TXT records are required to associate the domain entries with the Voter Client Azure Web App. Browsing to http://www.voter-demo.com or simply http://voter-demo.com brings up the Voter Client.
The Client sends and receives data via the Voter API, available securely at https://api.voter-demo.com.
Routing API Requests
With the Pods, Services, Ingress, and NGINX Ingress Controller created and configured, as well as the Azure Layer 4 Load Balancer and DNS Zone, HTTP requests from API consumers are properly and securely routed to the appropriate microservices. In the example below, three back-to-back requests are made to the
voter/info API endpoint. HTTP requests are properly routed to one of the three Voter pod replicas using the default round-robin algorithm, as proven by the observing the different hostnames (pod names) and the IP addresses (private pod IPs) in each HTTP response.
Shown below is the final Voter API Azure architecture. To simplify the diagram, I have deliberately left out the three microservice’s ClusterIP-type Services, the three default back-end application pods, and the default back-end application’s ClusterIP-type Service. All resources shown below are within the single East US Azure region, except DNS, which is a global resource.
Shown below is the new Azure Resource Group created by Azure during the AKS provisioning process. The Resource Group contains the various resources required to support the AKS cluster, NGINX Ingress Controller, and the Voter API. Necessary Azure resources were automatically provisioned when I provisioned AKS and when I created the new Voter API and NGINX resources.
In addition to the Resource Group above, the original Resource Group contains the AKS Container Service resource itself, Service Bus, Cosmos DB, and the DNS Zone resource.
The Voter Client Web App, consisting of the Azure App Service and App Service plan resource, is located in a third, separate Resource Group, not shown here.
Cleaning Up AKS
A nice feature of AKS, running a
az aks delete command will delete all the Azure resources created as part of provisioning AKS, the API, and the Ingress Controller. You will have to delete the Cosmos DB, Service Bus, and DNS Zone resources, separately.
az aks delete \ --name cluser_name_goes_here \ --resource-group resource_group_name_goes_here
Taking advantage of Kubernetes with AKS, and the array of Azure’s enterprise-grade resources, the Voter API was shifted from a simple Docker architecture to a production-ready solution. The Voter API is now easier to manage, horizontally scalable, fault-tolerant, and marginally more secure. It is capable of reliably supporting dozens more microservices, with multiple replicas. The Voter API will handle a high volume of data transactions and event messages.
There is much more that needs to be done to productionalize the Voter API on AKS, including:
- Add multi-region failover of Cosmos DB
- Upgrade to Service Bus Standard or Premium Tier
- Optimized Azure VMs and storage for anticipated traffic volumes and application-specific workloads
- Implement Kubernetes RBAC
- Add Monitoring, logging, and alerting with Envoy or similar
- Secure end-to-end TLS communications with Itsio or similar
- Secure the API with OAuth and Azure AD
- Automate everything with DevOps – AKS provisioning, testing code, creating resources, updating microservices, and managing data
All opinions in this post are my own, and not necessarily the views of my current or past employers or their clients.
Kubernetes as a Service
On October 24, 2017, less than a month prior to writing this post, Microsoft released the public preview of Managed Kubernetes for Azure Container Service (AKS). According to Microsoft, the goal of AKS is to simplify the deployment, management, and operations of Kubernetes. According to Gabe Monroy, PM Lead, Containers @ Microsoft Azure, in a blog post, AKS ‘features an Azure-hosted control plane, automated upgrades, self-healing, easy scaling.’ Monroy goes on to say, ‘with AKS, customers get the benefit of open source Kubernetes without complexity and operational overhead.’
Unquestionably, Kubernetes has become the leading Container-as-a-Service (CaaS) choice, at least for now. Along with the release of AKS by Microsoft, there have been other recent announcements, which reinforce Kubernetes dominance. In late September, Rancher Labs announced the release of Rancher 2.0. According to Rancher, Rancher 2.0 would be based on Kubernetes. In mid-October, at DockerCon Europe 2017, Docker announced they were integrating Kubernetes into the Docker platform. Even AWS seems to be warming up to Kubernetes, despite their own ECS, according to sources. There are rumors AWS will announce a Kubernetes offering at AWS re:Invent 2017, starting a week from now.
Being a big fan of both Azure and Kubernetes, I decided to give AKS a try. I chose to deploy an existing, simple, multi-tier web application, which I had used in several previous posts, including Eventual Consistency: Decoupling Microservices with Spring AMQP and RabbitMQ. All the code used for this post is available on GitHub.
The application, the Voter application, is composed of an AngularJS frontend client-side UI, and two Java Spring Boot microservices, both backed by individual MongoDB databases, and fronted with an HAProxy-based API Gateway. The AngularJS UI calls the API Gateway, which in turn calls the Spring services. The two microservices communicate with each other using HTTP-based inter-process communication (IPC). Although I would prefer event-based service-to-service IPC, HTTP-based IPC was simpler to implement, for this post.
Interestingly, the Voter application was designed using Docker Community Edition for Mac and deployed to AWS using Docker Community Edition for AWS. Not only would this be my chance to preview AKS, but also an opportunity to compare the ease of developing for Docker CE on AWS using a Mac, to developing for Kubernetes with AKS using Docker Community Edition for Windows.
In order to develop for AKS on my Windows 10 Enterprise workstation, I first made sure I had the latest copies of the following software:
- Docker CE for Windows (v17.11.0)
- Azure CLI (v2.0.21)
- kubectl (v1.8.3)
- kompose (v1.4.0)
- Windows PowerShell (v5.1)
If you are following along with the post, make sure you have the latest version of the Azure CLI, minimally 2.0.21, according to the Azure CLI release notes. Also, I happen to be running the latest version of Docker CE from the Edge Channel. However, either channel’s latest release of Docker CE for Windows should work for this post. Using PowerShell is optional. I prefer PowerShell over working from the Windows Command Prompt, if for nothing else than to preserve my command history, by default.
Kubernetes Resources with Kompose
Originally developed for Docker CE, the Voter application stack was defined in a single Docker Compose file.
To work on AKS, the application stack’s configuration needs to be reproduced as Kubernetes configuration files. Instead of writing the configuration files manually, I chose to use kompose. Kompose is described on its website as ‘a conversion tool for Docker Compose to container orchestrators such as Kubernetes.’ Using kompose, I was able to automatically convert the Docker Compose file into analogous Kubernetes resource configuration files.
kompose convert -f docker-compose.yml
For the AngularJS Client Service and the HAProxy API Gateway Service, I had to modify the Service configuration files to switch the Service type to a Load Balancer (
type: LoadBalancer). Being a Load Balancer, Kubernetes will assign a publically accessible IP address to each Service; the reasons for which are explained later in the post.
The MongoDB service requires a persistent storage volume. To accomplish this with Kubernetes, kompose created a PersistentVolumeClaims resource configuration file. I did have to create a corresponding PersistentVolume resource configuration file. It was also necessary to modify the PersistentVolumeClaims resource configuration file, specifying the Storage Class Name as manual, to correspond to the AKS Storage Class configuration (
From the original Docker Compose file, containing five Docker services, I ended up with a dozen individual Kubernetes resource configuration files. Individual configuration files are optimal for fine-grain management of Kubernetes resources. The Docker Compose file and the Kubernetes resource configuration files are included in the GitHub project.
git clone \ --branch master --single-branch --depth 1 --no-tags \ https://github.com/garystafford/azure-aks-demo.git
Creating AKS Resources
New AKS Feature Flag
According to Microsoft, to start with AKS, while still a preview, creating new clusters requires a feature flag on your subscription.
az provider register -n Microsoft.ContainerService
Using a brand new Azure account for this demo, I also needed to activate two additional feature flags.
az provider register -n Microsoft.Network az provider register -n Microsoft.Compute
If you are missing required features flags, you will see errors, similar to. the below error.
Operation failed with status: ’Bad Request’. Details: Required resource provider registrations Microsoft.Compute,Microsoft.Network are missing.
AKS requires an Azure Resource Group for AKS. I chose to create a new Resource Group, using the Azure CLI.
az group create \ --resource-group resource_group_name_goes_here \ --location eastus
New Kubernetes Cluster
aks feature of the Azure CLI version 2.0.21 or later, I provisioned a new Kubernetes cluster. By default, Azure will create a 3-node cluster. You can override the default number of nodes using the
--node-count parameter; I chose one node. The version of Kubernetes you choose is also configurable using the
--kubernetes-version parameter. I selected the latest Kubernetes version available with AKS, 1.8.2.
az aks create \ --name cluser_name_goes_here \ --resource-group resource_group_name_goes_here \ --node-count 1 \ --generate-ssh-keys \ --kubernetes-version 1.8.2
The newly created Azure Resource Group and AKS Kubernetes Cluster were both then visible on the Azure Portal.
In addition to the new Resource Group I created, Azure also created a second Resource Group containing seven Azure resources. These Azure resources include a Virtual Machine (the single Kubernetes node), Network Security Group, Network Interface, Virtual Network, Route Table, Disk, and an Availability Group.
With AKS up and running, I used another Azure CLI command to create a proxy connection to the Kubernetes Dashboard, which was deployed automatically and was running within the new AKS Cluster. The Kubernetes Dashboard is a general purpose, web-based UI for Kubernetes clusters.
az aks browse \ --name cluser_name_goes_here \ --resource-group resource_group_name_goes_here
Although no applications were deployed to AKS, yet, there were several Kubernetes components running within the AKS Cluster. Kubernetes components running within the kube-system Namespace, included heapster, kube-dns, kubernetes-dashboard, kube-proxy, kube-svc-redirect, and tunnelfront.
Deploying the Application
MongoDB should be deployed first. Both the Voter and Candidate microservices depend on MongoDB. MongoDB is composed of four Kubernetes resources, a Deployment resource, Service resource, PersistentVolumeClaim resource, and PersistentVolume resource. I used
kubectl, the command line interface for running commands against Kubernetes clusters, to create the four MongoDB resources, from the configuration files.
kubectl create \ -f voter-data-vol-persistentvolume.yaml \ -f voter-data-vol-persistentvolumeclaim.yaml \ -f mongodb-deployment.yaml \ -f mongodb-service.yaml
After MongoDB was deployed and running, I created the four remaining Deployment resources, Client, Gateway, Voter, and Candidate, from the Deployment resource configuration files. According to Kubernetes, ‘a Deployment controller provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate.’
Lastly, I created the remaining Service resources from the Service resource configuration files. According to Kubernetes, ‘a Service is an abstraction which defines a logical set of Pods and a policy by which to access them.’
Switching back to the Kubernetes Dashboard, the Voter application components were now visible.
There were five Kubernetes Pods, one for each application component. Since there is only one Node in the Kubernetes Cluster, all five Pods were deployed to the same Node. There were also five corresponding Kubernetes Deployments.
Similarly, there were five corresponding Kubernetes ReplicaSets, the next-generation Replication Controller. There were also five corresponding Kubernetes Services. Note the Gateway and Client Services have an External Endpoint (External IP) associated with them. The IPs were created as a result of adding the Load Balancer Service type to their Service resource configuration files, mentioned earlier.
Lastly, note the Persistent Disk Claim for MongoDB, which had been successfully bound.
Switching back to the Azure Portal, following the application deployment, there were now three additional resources in the AKS Resource Group, a new Azure Load Balancer and two new Public IP Addresses. The Load Balancer is used to balance the Client and Gateway Services, which both have public IP addresses.
To confirm the Gateway, Voter, and Candidate Services were reachable, using the public IP address of the Gateway Service, I browsed to HAProxy’s Statistics web page. Note the two backends, candidate and voter. The green color means HAProxy was able to successfully connect to both of these Services.
Accessing the Application
The Voter application’s AngularJS UI frontend can be accessed using the Client Service’s public IP address. However, this would not be very user-friendly. Even if I brought up the UI, using the public IP, the UI would be unable to connect to the HAProxy API Gateway, and subsequently, the Voter or Candidate Services. Based on the Client’s current configuration, the Client is expecting to find the Gateway at
To make accessing the Client more user-friendly, and to ensure the Client has access to the Gateway, I provisioned an Azure DNS Zone resource for my domain,
voter-demo.com. I assigned the DNS Zone to the AKS Resource Group.
Within the new DNS Zone, I created three DNS records. The first record, an Alias (A) record, associated
voter-demo.com with the public IP address of the Client Service. I added a second Alias (A) record for the
www subdomain, also associating it with the public IP address of the Client Service. The third Alias (A) record associated the
api subdomain with the public IP address of the Gateway Service.
At a high level, the voter application’s routing architecture looks as follows. The client’s requests to the primary domain or to the
api subdomain are resolved to one of the two public IP addresses configured in the load balancer’s frontend. The requests are passed to the load balancer’s backend pool, containing the single Azure VM, which is the single Kubernetes node, and onto the client or gateway Kubernetes Service. From there, requests are routed to one the appropriate Kubernetes Pods, containing the containerized application components, client or gateway.
Using Chrome’s Developer Tools, observe when a new vote is placed, an HTTP
POST is made to the gateway, on the
http://api.voter-demo.com:8080/voter/votes. The Gateway then proxies this request to the Voter Service, at
http://voter:8080/voter/votes. Since the Gateway and Voter Services both run within the same Cluster, the Gateway is able to use the Voter service’s name to address it, using Kubernetes kube-dns.
In the past, I have developed, deployed, and managed containerized applications, using Rancher, AWS, AWS ECS, native Kubernetes, RedHat OpenShift, Docker Enterprise Edition, and Docker Community Edition. Based on my experience, and given my limited testing with Azure’s public preview of AKS, I am very impressed. Creating the Kubernetes Cluster could not have been easier. Scaling the Cluster, adding Persistent Volumes, and upgrading Kubernetes, is equally as easy. Conveniently, AKS integrates with other Kubernetes tools, like kubectl, kompose, and Helm.
I did run into some minor issues with the AKS Preview, such as being unable to connect to an earlier Cluster, after upgrading from the default Kubernetes version to 1.82. I also experience frequent disconnects when proxying to the Kubernetes Dashboard. I am sure the AKS Preview bugs will be worked out by the time AKS is officially released.
In addition to the many advantages of Kubernetes as a CaaS, a huge advantage of using AKS is the ability to easily integrate Azure’s many enterprise-grade compute, networking, database, caching, storage, and messaging resources. It would require minimal effort to swap out the Voter application’s single containerized version of MongoDB with a highly performant and available instance of Azure Cosmos DB. Similarly, it would be relatively easy to swap out the single containerized version of HAProxy with a fully-featured and secure instance of Azure API Management. The current version of the Voter application replies on RabbitMQ for service-to-service IPC versus this earlier application version’s reliance on HTTP-based IPC. It would be fairly simple to swap RabbitMQ for Azure Service Bus.
Lastly, AKS easily integrates with leading Development and DevOps tooling and processes. Building, managing, and deploying applications to AKS, is possible with Visual Studio, VSTS, Jenkins, Terraform, and Chef, according to Microsoft.
A few good references to get started with AKS:
- Container Orchestration Simplified with Managed Kubernetes in Azure Container Service (AKS)
- Deploy an Azure Container Service (AKS) cluster
- Prepare application for Azure Container Service (AKS)
- Kubernetes Basics
- kubectl Cheat Sheet
All opinions in this post are my own, and not necessarily the views of my current or past employers or their clients.