Posts Tagged docker

Getting Started with PySpark for Big Data Analytics, using Jupyter Notebooks and Docker

Introduction

There is little question, big data analytics, data science, artificial intelligence, and machine learning (a subcategory of artificial intelligence) have all experienced a tremendous surge in popularity over the last few years. Behind the hype curves and marketing buzz, big data, machine learning (ML), and artificial intelligence (AI) are influencing all aspects of our modern lives. Due to their popularity and potential benefits, academic institutions and commercial enterprises are rushing to train large numbers of Data Scientists and ML and AI Engineers.

Learning popular technologies, such as Python, Scala, R, Apache Hadoop, Apache Spark, and Apache Kafka, requires the use of multiple complex technologies. Installing, configuring, and managing these technologies often demands an advanced level of familiarity with Linux, distributed systems, cloud- and container-based platforms, databases, and data-streaming applications. This may prove a deterrent to Students, Mathematicians, Statisticians, and Data Scientists.

Driven by the explosive growth of these technologies and the need to train individuals, many commercial enterprises, as well as open-source projects are making it easier to get started. The three major cloud providers, AWS, Azure, and Google Cloud, all have multiple Big Data-, AI- and ML-as-Service-offerings. An excellent example of an open-source project working on this challenge is the Project Jupyter. Similar to the Spark Notebook and Apache Zeppelin projects, Jupyter Notebooks enables data-driven, interactive, and collaborative data analytics with Julia, Scala, Python, R, and SQL.

This post will demonstrate the creation of a containerized development environment, using Jupyter Notebooks. The environment will be suited for learning and developing applications for Apache Spark, using the Scala, Python, and R programming languages. This post is not intended to be a tutorial on Spark, PySpark, or Jupyter Notebooks.

Technologies

The following technologies are featured prominently in this post.

feature

Jupyter Notebooks

According to Project Jupyter, the Jupyter Notebook is an open-source web application that allows users to create and share documents that contain live code, equations, visualizations, and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. The word, Jupyter, is a loose acronym for Julia, Python, and R, but today, the Jupyter supports many programming languages.

Jupyter Docker Stacks

To enable quick and easy access to Jupyter Notebooks, Project Jupyter has created Jupyter Docker Stacks. The stacks are ready-to-run Docker images containing Jupyter applications, along with accompanying technologies. Currently, eight different Jupyter Docker Stacks focus on a particular area of practice. They include SciPy (Python-based mathematics, science, and engineering), TensorFlow, R Project for statistical computing, Data Science with Julia, and the main subject of this post, PySpark. The stacks also include a rich variety of packages to extend their functionality, such as matplotlibggplot2, bokeh, ipywidgets , and Facets.

Apache Spark

According to Apache, Spark is a unified analytics engine for large-scale data processing, used by well-known, modern enterprises, such as Netflix, Yahoo, and eBay. With speeds up to 100x faster than Hadoop, Apache Spark achieves high performance for static, batch, and streaming data, using a state-of-the-art DAG (Directed Acyclic Graph) scheduler, a query optimizer, and a physical execution engine.

Spark’s polyglot programming model allows users to write applications quickly in Scala, Java, Python, R, and SQL. Spark includes libraries for Spark SQL (DataFrames and Datasets), MLlib (Machine Learning), GraphX (Graph Processing), and DStreams (Spark Streaming). You can run Spark using its standalone cluster mode, on Amazon EC2Apache Hadoop YARNMesos, or Kubernetes.

PySpark

The Spark Python API (PySpark) exposes the Spark programming model to Python. PySpark is built on top of Spark’s Java API. Data is processed in Python and cached and shuffled in the JVM. According to Apache, Py4J enables Python programs running in a Python interpreter to dynamically access Java objects in a JVM.

Docker

According to Docker, their technology developers and IT the freedom to build, manage and secure business-critical applications without the fear of technology or infrastructure lock-in. Although Kubernetes is now the leading open-source container orchestration platform, Docker is still the predominant underlying container engine.

Docker Swarm

Current versions of Docker include swarm mode for natively managing a cluster of Docker Engines, called a swarm. The Docker CLI is used to create a swarm, deploy application services to a swarm, and manage swarm behavior.

PostgreSQL

PostgreSQL is a powerful, open source object-relational database system. According to their website, PostgreSQL comes with many features aimed to help developers build applications, administrators to protect data integrity and build fault-tolerant environments, and help manage data no matter how big or small the dataset.

Demonstration

To show the capabilities of the Jupyter development environment, I will demonstrate a few typical use cases, such as executing Python scripts, submitting PySpark jobs, working with Jupyter Notebooks, and reading and writing data to and from different format files and to a database. We will be using the jupyter/all-spark-notebook Docker Image. This image includes Python, R, and Scala support for Apache Spark, using Apache Toree.

Architecture

As shown below, we will stand-up a Docker stack, consisting of Jupyter All-Spark-Notebook, PostgreSQL 10.5, and Adminer containers. The Docker stack will have local directories bind-mounted into the containers. Files from our GitHub project will be shared with the Jupyter application container through a bind-mounted directory. Our PostgreSQL data will also be persisted through a bind-mounted directory. This allows us to persist data external to the ephemeral containers.

PySparkDocker.png

Source Code

All open-sourced code for this post can be found on GitHub. Source code samples are displayed as GitHub Gists, which may not display correctly on some mobile and social media browsers.

Deploy Docker Stack

To start, create the $HOME/data/postgre directory to store PostgreSQL data files. This directory will be bind mounted into the PostgreSQL container on line 36 of the stack.yml file, $HOME/data/postgre:/var/lib/postgresql/data. The HOME environment variable assumes you are working on Linux or MacOS, and is equivalent to HOMEPATH on Windows.

The Jupyter container’s working directory is set on line 10 of the stack.yml file, working_dir:/home/$USER/workThe local bind-mounted working directory is $PWD/work. This path is bind-mounted to the working directory in the Jupyter container, on line 24 of the stack.yml file, $PWD/work:/home/$USER/work. The PWD environment variable assumes you are working on Linux or MacOS (CD on Windows).

By default, the user within the Jupyter container is jovyan. Optionally, I have chosen to override that user with my own local host’s user account, as shown on line 16 of the stack.yml file, NB_USER: $USER. I have used the MacOS host’s USER environment variable value (equivalent to USERNAME on Windows). There are many options for configuring the Jupyter container, detailed here. Several of those options are shown on lines 12-18 of the stack.yml file.

Assuming you have a recent version of Docker installed on your local development machine, and running in swarm mode, standing up the stack is as easy as running the following command from the root directory of the project:

docker stack deploy -c stack.yml pyspark

The Docker stack consists of a new overlay network, pyspark-net and the three containers. To confirm the stack deployed, you can run the following command:

docker stack ps pyspark --no-trunc

pyspark_article_01_stack_deploy

To access the Jupyter Notebook application, you need to obtain the Jupyter URL and access token (read more here). This information is output in the Jupyter container log, which can be accessed with the following command:

docker logs $(docker ps | grep _pyspark | awk '{print $NF}')

pyspark_article_02_pyspark_logs

Using the URL and token shown in the log output, you will be able to access the Jupyter web-based user interface on localhost port 8888. Once there, from Jupyter dashboard landing page, you should see all the files in the project’s work/ directory.

Also shown below, note the types of files you are able to create from the dashboard, including Python 3, R, Scala (using Toree or spylon-kernal), and text. You can also open a Jupyter Terminal or create a new Folder.
pyspark_article_04_working_dir

Running Python Scripts

Instead of worrying about installing and maintaining the latest version of Python and packages on your own development machine, we can run our Python scripts from the Jupyter container. At the time of this post, the latest jupyter/all-spark-notebook Docker Image runs Python 3.6.6. Let’s start with a simple example of the Jupyter container’s capabilities by running a Python script. I’ve included a sample Python script, 01_simple_script.py.

Run the script from within the Jupyter container, from a Jupyter Terminal window:

python ./01_simple_script.py

You should observe the following output.
pyspark_article_08_simple_script

Kaggle Datasets

To explore the features of the Jupyter Notebook container and PySpark, we will use a publically-available dataset from Kaggle. Kaggle is a fantastic open-source resource for datasets used for big-data and ML applications. Their tagline is ‘Kaggle is the place to do data science projects’.

For this demonstration, I chose the ‘Transactions from a Bakery’ dataset from Kaggle. The dataset contains 21,294 rows, each with four columns of data. Although certainly nowhere near ‘big data’, the dataset is large enough to test out the Jupyter container functionality.
pyspark_article_03_kaggle

Submitting Spark Jobs

We are not limited to Jupyter Notebooks to interact with Spark, we can also submit scripts directly to Spark from a Jupyter Terminal, or from our IDE. I have included a simple Python script, 02_bakery_dataframes.py. The script loads the Kaggle Bakery dataset from the CSV file into a Spark DataFrame. The script then prints out the top ten rows of data, along with a count of the total number of rows in the DataFrame.

Run the script directly from a Jupyter Terminal window:

python ./02_bakery_dataframes.py

An example of the output of the Spark job is shown below. At the time of this post, the latest jupyter/all-spark-notebook Docker Image runs Spark 2.3.1, Scala 2.11.8, Conda 4.5.11, and Java 1.8.0 using the OpenJDK.
pyspark_article_09_simple_spark

More typically, you would submit the Spark job, using the spark-submit command. Use a Jupyter Terminal window to run the following command:

$SPARK_HOME/bin/spark-submit 02_bakery_dataframes.py

Below, we see the beginning of the output from Spark, using the spark-submit command.
pyspark_article_09B1_spark_submit

Below, we see the scheduled tasks executing and the output of the print statement, displaying the top 10 rows of bakery data.

Interacting with Databases

Often with Spark, you are loading data from one or more data sources (input). After performing operations and transformations on the data, the data is persisted or conveyed to another system for further processing (output).

To demonstrate the flexibility of the Jupyter Docker Stacks to work with databases, I have added PostgreSQL to the Docker Stack. We can read and write data from the Jupyter container to the PostgreSQL instance, running in a separate container.

To begin, we will run a SQL script, written in Python, to create our database schema and some test data in a new database table. To do so, we will need to install the psycopg2 package into our Jupyter container. You can use the docker exec command, or as a superuser, your user has administrative access to install Python packages within the container or using the Jupyter Terminal window. Both pip and conda are available to install packages, see details here.

Run the following command to install psycopg2:

docker exec -it \
  $(docker ps | grep _pyspark | awk '{print $NF}') \
  pip install psycopg2 psycopg2-binary

These packages give Python the ability to interact with the PostgreSQL. The included Python script, 03_load_sql.py, will execute a set of SQL statements, contained in a SQL file, bakery_sample.sql, against the PostgreSQL container instance.

To execute the script, run the following command:

python ./03_load_sql.py

This should result in the following output, if successful.
pyspark_article_10_run_sql_py

To confirm the SQL script’s success, I have included Adminer. Adminer (formerly phpMinAdmin) is a full-featured database management tool written in PHP. Adminer natively recognizes PostgreSQL, MySQL, SQLite, and MongoDB, among other database engines.

Adminer should be available on localhost port 8080. The password credentials, shown below, are available in the stack.yml file. The server name, postgres, is the name of the PostgreSQL container. This is the domain name the Jupyter container will use to communicate with the PostgreSQL container.
pyspark_article_06_adminer_login

Connecting to the demo database with Adminer, we should see the bakery_basket table. The table should contain three rows of data, as shown below.
pyspark_article_07_bakery_data

Developing Jupyter NoteBooks

The true power of the Jupyter Docker Stacks containers is Jupyter Notebooks. According to the Jupyter Project, the notebook extends the console-based approach to interactive computing in a qualitatively new direction, providing a web-based application suitable for capturing the whole computation process: developing, documenting, and executing code, as well as communicating the results. Notebook documents contain the inputs and outputs of an interactive session as well as additional text that accompanies the code but is not meant for execution.

To see the power of Jupyter Notebooks, I have written a basic notebook document, 04_pyspark_demo_notebook.ipynb. The document performs some typical PySpark functions, such as loading data from a CSV file and from the PostgreSQL database, performing some basic data analytics with Spark SQL, graphing the data using BokehJS, and finally, saving data back to the database, as well as to the popular Apache Parquet file format. Below we see the notebook document, using the Jupyter Notebook user interface.

pyspark_article_11_notebook.png

PostgreSQL Driver

The only notebook document dependency, not natively part of the Jupyter Image, is the PostgreSQL JDBC driver. The driver, postgresql-42.2.5.jar, is included in the project and referenced in the configuration of the notebook’s Spark Session. The JAR is added to the spark.driver.extraClassPath runtime environment property. This ensures the JAR is available to Spark (written in Scala) when the job is run.

PyCharm

Since the working directory for the project is shared with the container, you can also edit files, including notebook documents, in your favorite IDE, such as JetBrains PyCharm. PyCharm has built-in language support for Jupyter Notebooks, as shown below.
pyspark_article_11_notebook_pycharm.png

As mentioned earlier, a key feature of Jupyter Notebooks is their ability to save the output from each Cell as part of the notebook document. Below, we see the notebook document on GitHub. The output is saved, as part of the notebook document. Not only can you distribute the notebook document, but you can also preserve and share the output from each cell.
pyspark_article_17_github

nbviewer

Notebooks can also be viewed using Jupyter nbviewer, hosted on Rackspace. Below, we see a section of this project’s notebook on nbviewer. You can view this project’s actual notebook document, using nbviewer, here.

pyspark_article_22_nbviewer.png

Monitoring Spark Jobs

The Jupyter Docker container exposes Spark’s monitoring and instrumentation web user interface. We can observe each Spark Job in great detail.
pyspark_article_12_spark_jobs

We can review details of each stage of the Spark job, including a visualization of the DAG, which Spark constructs as part of the job execution plan, using the DAG Scheduler.
pyspark_article_12_spark_dag

We can also review the timing of each event, occurring as part of the stages of the Spark job.
pyspark_article_12_timeline

We can also use the Spark interface to review and confirm the runtime environment, including versions of Java, Scala, and Spark, as well as packages available on the Java classpath.
pyspark_article_13_enviornment

Spark Performance

Spark, running on a single node within the Jupyter container, on your development system, is not a substitute for a full Spark cluster, running on bare metal or robust virtualized hardware, with YARN, Mesos, or Kubernetes. In my opinion, you should adjust Docker to support an acceptable performance profile for the stack, running only a modest workload. You are not trying to replace the need to run real jobs on a Production Spark cluster.
pyspark_article_16c_perf

We can use the docker stats command to examine the container’s CPU and memory metrics:

docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"

Below, we see the stats from the stack’s three containers immediately after being deployed, showing little or no activity. Here, Docker has been allocated 2 CPUs, 3GB of RAM, and 2 GB of swap space available, from the host machine.
pyspark_article_16a_perf

Compare the stats above with the same three containers, while the example notebook document is running on Spark. The CPU shows a spike, but memory usage appears to be within acceptable ranges.
pyspark_article_16b_perf

Linux top and htop

Another option to examine container performance metrics is with top. We can use the docker exec command to execute the top command within the Jupyter container, sorting processes by CPU usage:

docker exec -it \
  $(docker ps | grep _pyspark | awk '{print $NF}') \
  top -o %CPU

With top, we can observe the individual performance of each processes running in the Jupyter container.

pyspark_article_20_top.png

Lastly, htop, an interactive process viewer for Unix, can be installed into the container and ran with the following set of bash commands, from a Jupyter Terminal window or using docker exec:

docker exec -it \
  $(docker ps | grep _pyspark | awk '{print $NF}') \
  sh -c "apt-get update && apt-get install htop && htop --sort-key PERCENT_CPU"

With htop, we can observe individual CPU activity. The two CPUs at the top left of the htop window are the two CPUs assigned to Docker. We get insight into the way Docker is using each CPU, as well as other basic performance metrics, like memory and swap.

pyspark_article_16f_htop.png

Assuming your development machine host has them available, it is easy to allocate more compute resources to Docker if required. However, in my opinion, this stack is optimized for development and learning, using reasonably sized datasets for data analysis and ML. It should not be necessary to allocate excessive resources to Docker, possibly starving your host machine own compute capabilities.
pyspark_article_16e_perf

Conclusion

In this brief post, we have seen how easy it is to get started learning and developing applications for big data analytics, using Python, Spark, and PySpark, thanks to the Jupyter Docker Stacks. We could use the same stack to learn and develop for machine learning, using Python, Scala, and R. Extending the stack’s capabilities is as simple as swapping out this Jupyter image for another, with a different set of tools, as well as adding additional containers to the stack, such as Apache Kafka or Apache Cassandra.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers, their clients.

, , , , , , , , ,

Leave a comment

Integrating Search Capabilities with Actions for Google Assistant, using GKE and Elasticsearch: Part 2

Introduction

Voice and text-based conversational interfaces, such as chatbots, have recently seen tremendous growth in popularity. Much of this growth can be attributed to leading Cloud providers, such as Google, Amazon, and Microsoft, who now provide affordable, end-to-end development, machine learning-based training, and hosting platforms for conversational interfaces.

Cloud-based machine learning services greatly improve a conversational interface’s ability to interpret user intent with greater accuracy. However, the ability to return relevant responses to user inquiries, also requires interfaces have access to rich informational datastores, and the ability to quickly and efficiently query and analyze that data.

In this two-part post, we will enhance the capabilities of a voice and text-based conversational interface by integrating it with a search and analytics engine. By interfacing an Action for Google Assistant conversational interface with Elasticsearch, we will improve the Action’s ability to provide relevant results to the end-user. Instead of querying a traditional database for static responses to user intent, our Action will access a  Near Realtime (NRT) Elasticsearch index of searchable documents. The Action will leverage Elasticsearch’s advanced search and analytics capabilities to optimize and shape user responses, based on their intent.

Action Preview

Here is a brief YouTube video preview of the final Action for Google Assistant, integrated with Elasticsearch, running on an Apple iPhone.

Architecture

If you recall from part one of this post, the high-level architecture of our search engine-enhanced Action for Google Assistant resembles the following. Most of the components are running on Google Cloud.

Google Search Assistant Diagram GCP

Source Code

All open-sourced code for this post can be found on GitHub in two repositories, one for the Spring Boot Service and one for the Action for Google Assistant. Code samples in this post are displayed as GitHub Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

Development Process

In part two of this post, we will tie everything together by creating and integrating our Action for Google Assistant:

  • Create the new Actions for Google Assistant project using the Actions on Google console;
  • Develop the Action’s Intents and Entities using the Dialogflow console;
  • Develop, deploy, and test the Cloud Function to GCP;

Let’s explore each step in more detail.

New ‘Actions on Google’ Project

With Elasticsearch running and the Spring Boot Service deployed to our GKE cluster, we can start building our Actions for Google Assistant. Using the Actions on Google web console, we first create a new Actions project.

wp-search-021

The Directory Information tab is where we define metadata about the project. This information determines how it will look in the Actions directory and is required to publish your project. The Actions directory is where users discover published Actions on the web and mobile devices.

wp-search-019

The Directory Information tab also includes sample invocations, which may be used to invoke our Actions.

wp-search-020

Actions and Intents

Our project will contain a series of related Actions. According to Google, an Action is ‘an interaction you build for the Assistant that supports a specific intent and has a corresponding fulfillment that processes the intent.’ To build our Actions, we first want to create our Intents. To do so, we will want to switch from the Actions on Google console to the Dialogflow console. Actions on Google provides a link for switching to Dialogflow in the Actions tab.

wp-search-022

We will build our Action’s Intents in Dialogflow. The term Intent, used by Dialogflow, is standard terminology across other voice-assistant platforms, such as Amazon’s Alexa and Microsoft’s Azure Bot Service and LUIS. In Dialogflow, will be building Intents — the Find Multiple Posts Intent, Find Post Intent, Find By ID Intent, and so forth.

wp-search-023

Below, we see the Find Post Intent. The Find Post Intent is responsible for handling our user’s requests for a single post about a topic, for example, ‘Find a post about Docker.’ The Intent shown below contains a fair number, but indeed not an exhaustive list, of training phrases. These represent possible ways a user might express intent when invoking the Action.

wp-search-026

Below, we see the Find Multiple Posts Intent. The Find Multiple Posts Intent is responsible for handling our user’s requests for a list of posts about a topic, for example, ‘I’m interested in Docker.’ Similar to the Find Post Intent above, the Find Multiple Posts Intent contains a list of training phrases.

wp-search-025

Dialog Model Training

According to Google, the greater the number of natural language examples in the Training Phrases section of Intents, the better the classification accuracy. Every time a user interacts with our Action, the user’s utterances are logged. Using the Training tab in the Dialogflow console, we can train our model by reviewing and approving or correcting how the Action handled the user’s utterances.

Below we see the user’s utterances, part of an interaction with the Action. We have the option to review and approve the Intent that was called to handle the utterance, re-assign it, or delete it. This helps improve our accuracy of our dialog model.

wp-search-039.png

Dialogflow Entities

Each of the highlighted words in the training phrases maps to the facts parameter, which maps to a collection of @topic Entities. Entities represent a list of intents the Action is trained to understand.  According to Google, there are three types of entities: ‘system’ (defined by Dialogflow), ‘developer’ (defined by a developer), and ‘user’ (built for each individual end-user in every request) objects. We will be creating ‘developer’ type entities for our Action’s Intents.

wp-search-037.png

Automated Expansion

We do not have to define all possible topics a user might search for, as an entity.  By enabling the Allow Automated Expansion option, an Agent will recognize values that have not been explicitly listed in the entity list. Google describes Agents as NLU (Natural Language Understanding) modules.

wp-search-042.png

Entity Synonyms

An entity may contain synonyms. Multiple synonyms are mapped to a single reference value. The reference value is the value passed to the Cloud Function by the Action. For example, take the reference value of ‘GCP.’ The user might ask Google about ‘GCP’. However, the user might also substitute the words ‘Google Cloud’ or ‘Google Cloud Platform.’ Using synonyms, if the user utters any of these three synonymous words or phrase in their intent, the reference value, ‘GCP’, is passed in the request.

But, what if the post contains the phrase, ‘Google Cloud Platform’ more frequently than, or instead of, ‘GCP’? If the acronym, ‘GCP’, is defined as the entity reference value, then it is the value passed to the function, even if you ask for ‘Google Cloud Platform’. In the use case of searching blog posts by topic, entity synonyms are not an effective search strategy.

Elasticsearch Synonyms

A better way to solve for synonyms is by using the synonyms feature of Elasticsearch. Take, for example, the topic of ‘Istio’, Istio is also considered a Service Mesh. If I ask for posts about ‘Service Mesh’, I would like to get back posts that contain the phrase ‘Service Mesh’, but also the word ‘Istio’. To accomplish this, you would define an association between ‘Istio’ and ‘Service Mesh’, as part of the Elasticsearch WordPress posts index.

wp-search-041d

Searches for ‘Istio’ against that index would return results that contain ‘Istio’ and/or contain ‘Service Mesh’; the reverse is also true. Having created and applied a custom synonyms filter to the index, we see how Elasticsearch responds to an analysis of the natural language style phrase, ‘What is a Service Mesh?’. As shown by the tokens output in Kibana’s Dev Tools Console, Elasticsearch understands that ‘service mesh’ is synonymous with ‘istio’.

wp-search-041g

If we query the same five fields as our Action, for the topic of ‘service mesh’, we get four hits for posts (indexed documents) that contain ‘service mesh’ and/or ‘istio’.

wp-search-041c

Actions on Google Integration

Another configuration item in Dialogflow that needs to be completed is the Dialogflow’s Actions on Google integration. This will integrate our Action with Google Assistant. Google currently provides more than fifteen different integrations, including Google Assistant, Slack, Facebook Messanger, Twitter, and Twilio, as shown below.

wp-search-028

To configure the Google Assistant integration, choose the Welcome Intent as our Action’s Explicit Invocation intent. Then we designate our other Intents as Implicit Invocation intents. According to Google, this Google Assistant Integration allows our Action to reach users on every device where the Google Assistant is available.

wp-search-029

Action Fulfillment

When a user’s intent is received, it is fulfilled by the Action. In the Dialogflow Fulfillment console, we see the Action has two fulfillment options, a Webhook or an inline-editable Cloud Function, edited inline. A Webhook allows us to pass information from a matched intent into a web service and get a result back from the service. Our Action’s Webhook will call our Cloud Function on GCP, using the Cloud Function’s URL endpoint (we’ll get this URL in the next section).

wp-search-030

Google Cloud Functions

Our Cloud Function, called by our Action, is written in Node.js. Our function, index.js, is divided into four sections, which are: constants and environment variables, intent handlers, helper functions, and the function’s entry point. The helper functions are part of the Helper module, contained in the helper.js file.

Constants and Environment Variables

The section, in both index.js and helper.js, defines the global constants and environment variables used within the function. Values that reference environment variables, such as SEARCH_API_HOSTNAME are defined in the .env.yaml file. All environment variables in the .env.yaml file will be set during the Cloud Function’s deployment, described later in this post. Environment variables were recently released, and are still considered beta functionality (gist).

The npm module dependencies declared in this section are defined in the dependencies section of the package.json file. Function dependencies include Actions on Google, Firebase Functions, Winston, and Request (gist).

Intent Handlers

The intent handlers in this section correspond to the intents in the Dialogflow console. Each handler responds with a SimpleResponse, BasicCard, and Suggestion Chip response types, or  Simple Response, List, and Suggestion Chip response types. These response types were covered in part one of this post. (gist).

The Welcome Intent handler handles explicit invocations of our Action. The Fallback Intent handler handles both help requests, as well as cases when Dialogflow is unable to handle the user’s request.

As described above in the Dialogflow section, the Find Post Intent handler is responsible for handling our user’s requests for a single post about a topic. For example, ‘Find a post about Docker’. To fulfill the user request, the Find Post Intent handler, calls the Helper module’s getPostByTopic function, passing the topic requested and specifying a result set size of one post with the highest relevance score higher than an arbitrary value of  1.0.

Similarly, the Find Multiple Posts Intent handler is responsible for handling our user’s requests for a list of posts about a topic; for example, ‘I’m interested in Docker’. To fulfill the user request, the Find Multiple Posts Intent handler, calls the Helper module’s getPostsByTopic function, passing the topic requested and specifying a result set size of a maximum of six posts with the highest relevance scores greater than 1.0

The Find By ID Intent handler is responsible for handling our user’s requests for a specific, unique posts ID; for example, ‘Post ID 22141’. To fulfill the user request, the Find By ID Intent handler, calls the Helper module’s getPostById function, passing the unique Post ID (gist).

Entry Point

The entry point creates a way to handle the communication with Dialogflow’s fulfillment API (gist).

Helper Functions

The helper functions are part of the Helper module, contained in the helper.js file. In addition to typical utility functions like formatting dates, there are two functions, which interface with Elasticsearch, via our Spring Boot API, getPostsByTopic and getPostById. As described above, the intent handlers call one of these functions to obtain search results from Elasticsearch.

The getPostsByTopic function handles both the Find Post Intent handler and Find Multiple Posts Intent handler, described above. The only difference in the two calls is the size of the response set, either one result or six results maximum (gist).

Both functions use the request and request-promise-native npm modules to call the Spring Boot service’s RESTful API over HTTP. However, instead of returning a callback, the request-promise-native module allows us to return a native ES6 Promise. By returning a promise, we can use async/await with our Intent handlers. Using async/await with Promises is a newer way of handling asynchronous operations in Node.js. The asynchronous programming model, using promises, is described in greater detail in my previous post, Building Serverless Actions for Google Assistant with Google Cloud Functions, Cloud Datastore, and Cloud Storage.

ThegetPostById function handles both the Find By ID Intent handler and Option Intent handler, described above. This function is similar to the getPostsByTopic function, calling a Spring Boot service’s RESTful API endpoint and passing the Post ID (gist).

Cloud Function Deployment

To deploy the Cloud Function to GCP, use the gcloud CLI with the beta version of the functions deploy command. According to Google, gcloud is a part of the Google Cloud SDK. You must download and install the SDK on your system and initialize it before you can use gcloud. Currently, Cloud Functions are only available in four regions. I have included a shell scriptdeploy-cloud-function.sh, to make this step easier. It is called using the npm run deploy function. (gist).

The creation or update of the Cloud Function can take up to two minutes. Note the output indicates the environment variables, contained in the .env.yaml file, have been deployed. The URL endpoint of the function and the function’s entry point are also both output.

wp-search-031.png

If you recall, the URL endpoint of the Cloud Function is required in the Dialogflow Fulfillment tab. The URL can be retrieved from the deployment output (shown above). The Cloud Function is now deployed and will be called by the Action when a user invokes the Action.

What is Deployed

The .gcloudignore file is created the first time you deploy a new function. Using the the .gcloudignore file, you limit the files deployed to GCP. For this post, of all the files in the project, only four files, index.js, helper.js, package.js, and the PNG file used in the Action’s responses, need to be deployed. All other project files are ear-marked in the .gcloudignore file to avoid being deployed.

wp-search-038.png

Simulation Testing and Debugging

With our Action and all its dependencies deployed and configured, we can test the Action using the Simulation console on Actions on Google. According to Google, the Action Simulation console allows us to manually test our Action by simulating a variety of Google-enabled hardware devices and their settings.

Below, in the Simulation console, we see the successful display of our Programmatic Ponderings Search Action for Google Assistant containing the expected Simple Response, List, and Suggestion Chips response types, triggered by a user’s invocation of the Action.

wp-search-035

The simulated response indicates that the Google Cloud Function was called, and it responded successfully. That also indicates the Dialogflow-based Action successfully communicated with the Cloud Function, the Cloud Function successfully communicated with the Spring Boot service instances running on Google Kubernetes Engine, and finally, the Spring Boot services successfully communicated with Elasticsearch running on Google Compute Engine.

If we had issues with the testing, the Action Simulation console also contains tabs containing the request and response objects sent to and from the Cloud Function, the audio response, a debug console, any errors, and access to the logs.

Stackdriver Logging

In the log output below, from our Cloud Function, we see our Cloud Function’s activities. These activities including information log entries, which we explicitly defined in our Cloud Function using the winston and @google-cloud/logging-winston npm modules. According to Google, the author of the module, Stackdriver Logging for Winston provides an easy to use, higher-level layer (transport) for working with Stackdriver Logging, compatible with Winston. Developing an effective logging strategy is essential to maintaining and troubleshooting your code in Development, as well as Production.

wp-search-036

Conclusion

In this two-part post, we observed how the capabilities of a voice and text-based conversational interface, such as an Action for Google Assistant, may be enhanced through integration with a search and analytics engine, such as Elasticsearch. This post barely scraped the surface of what could be achieved with such an integration. Elasticsearch, as well as other leading Lucene-based search and analytics engines, such as Apache Solr, have tremendous capabilities, which are easily integrated to machine learning-based conversational interfaces, resulting in a more powerful and a more intuitive end-user experience.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers, their clients, or Google.

, , , , , , , , , , , , ,

1 Comment

Integrating Search Capabilities with Actions for Google Assistant, using GKE and Elasticsearch: Part 1

Introduction

Voice and text-based conversational interfaces, such as chatbots, have recently seen tremendous growth in popularity. Much of this growth can be attributed to leading Cloud providers, such as Google, Amazon, and Microsoft, who now provide affordable, end-to-end development, machine learning-based training, and hosting platforms for conversational interfaces.

Cloud-based machine learning services greatly improve a conversational interface’s ability to interpret user intent with greater accuracy. However, the ability to return relevant responses to user inquiries, also requires interfaces have access to rich informational datastores, and the ability to quickly and efficiently query and analyze that data.

In this two-part post, we will enhance the capabilities of a voice and text-based conversational interface by integrating it with a search and analytics engine. By interfacing an Action for Google Assistant conversational interface with Elasticsearch, we will improve the Action’s ability to provide relevant results to the end-user. Instead of querying a traditional database for static responses to user intent, our Action will access a  Near Realtime (NRT) Elasticsearch index of searchable documents. The Action will leverage Elasticsearch’s advanced search and analytics capabilities to optimize and shape user responses, based on their intent.

Action Preview

Here is a brief YouTube video preview of the final Action for Google Assistant, integrated with Elasticsearch, running on an Apple iPhone.

Google Technologies

The high-level architecture of our search engine-enhanced Action for Google Assistant will look as follows.

Google Search Assistant Diagram GCP

Here is a brief overview of the key technologies we will incorporate into our architecture.

Actions on Google

According to Google, Actions on Google is the platform for developers to extend the Google Assistant. Actions on Google is a web-based platform that provides a streamlined user-experience to create, manage, and deploy Actions. We will use the Actions on Google platform to develop our Action in this post.

Dialogflow

According to Google, Dialogflow is an enterprise-grade NLU platform that makes it easy for developers to design and integrate conversational user interfaces into mobile apps, web applications, devices, and bots. Dialogflow is powered by Google’s machine learning for Natural Language Processing (NLP).

Google Cloud Functions

Google Cloud Functions are part of Google’s event-driven, serverless compute platform, part of the Google Cloud Platform (GCP). Google Cloud Functions are analogous to Amazon’s AWS Lambda and Azure Functions. Features include automatic scaling, high availability, fault tolerance, no servers to provision, manage, patch or update, and a payment model based on the function’s execution time.

Google Kubernetes Engine

Kubernetes Engine is a managed, production-ready environment, available on GCP, for deploying containerized applications. According to Google, Kubernetes Engine is a reliable, efficient, and secure way to run Kubernetes clusters in the Cloud.

Elasticsearch

Elasticsearch is a leading, distributed, RESTful search and analytics engine. Elasticsearch is a product of Elastic, the company behind the Elastic Stack, which includes Elasticsearch, Kibana, Beats, Logstash, X-Pack, and Elastic Cloud. Elasticsearch provides a distributed, multitenant-capable, full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is similar to Apache Solr in terms of features and functionality. Both Solr and Elasticsearch is based on Apache Lucene.

Other Technologies

In addition to the major technologies highlighted above, the project also relies on the following:

  • Google Container Registry – As an alternative to Docker Hub, we will store the Spring Boot API service’s Docker Image in Google Container Registry, making deployment to GKE a breeze.
  • Google Cloud Deployment Manager – Google Cloud Deployment Manager allows users to specify all the resources needed for application in a declarative format using YAML. The Elastic Stack will be deployed with Deployment Manager.
  • Google Compute Engine – Google Compute Engine delivers scalable, high-performance virtual machines (VMs) running in Google’s data centers, on their worldwide fiber network.
  • Google Stackdriver – Stackdriver aggregates metrics, logs, and events from our Cloud-based project infrastructure, for troubleshooting.  We are also integrating Stackdriver Logging for Winston into our Cloud Function for fast application feedback.
  • Google Cloud DNS – Hosts the primary project domain and subdomains for the search engine and API. Google Cloud DNS is a scalable, reliable and managed authoritative Domain Name System (DNS) service running on the same infrastructure as Google.
  • Google VPC Network FirewallFirewall rules provide fine-grain, secure access controls to our API and search engine. We will several firewall port openings to talk to the Elastic Stack.
  • Spring Boot – Pivotal’s Spring Boot project makes it easy to create stand-alone, production-grade Spring-based Java applications, such as our Spring Boot service.
  • Spring Data Elasticsearch – Pivotal Software’s Spring Data Elasticsearch project provides easy integration to Elasticsearch from our Java-based Spring Boot service.

Demonstration

To demonstrate an Action for Google Assistant with search engine integration, we need an index of content to search. In this post, we will build an informational Action, the Programmatic Ponderings Search Action, that responds to a user’s interests in certain technical topics, by returning post suggestions from the Programmatic Ponderings blog. For this demonstration, I have indexed the last two years worth of blog posts into Elasticsearch, using the ElasticPress WordPress plugin.

Source Code

All open-sourced code for this post can be found on GitHub in two repositories, one for the Spring Boot Service and one for the Action for Google Assistant. Code samples in this post are displayed as GitHub Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

Development Process

This post will focus on the development and integration of the Action for Google Assistant with Elasticsearch, via a Google Cloud Function, Kubernetes Engine, and the Spring Boot API service. The post is not intended to be a general how-to on developing for Actions for Google Assistant, Google Cloud Platform, Elasticsearch, or WordPress.

Building and integrating the Action will involve the following steps:

  • Design the Action’s conversation model;
  • Provision the Elastic Stack on Google Compute Engine using Deployment Manager;
  • Create an Elasticsearch index of blog posts;
  • Provision the Kubernetes cluster on GCP with GKE;
  • Develop and deploy the Spring Boot API service to Kubernetes;

Covered in Part Two of the Post:

  • Create the new Actions project using the Actions on Google;
  • Develop the Action’s Intents using the Dialogflow;
  • Develop, deploy, and test the Cloud Function to GCP;

Let’s explore each step in more detail.

Conversational Model

The conversational model design of the Programmatic Ponderings Search Action for Google Assistant will have the option to invoke the Action in two ways, with or without intent. Below on the left, we see an example of an invocation of the Action – ‘Talk to Programmatic Ponderings’. Google Assistant then responds to the user for more information (intent) – ‘What topic are you interested in reading about?’.

sample-dialog-1.png

Below on the left, we see an invocation of the Action, which includes the intent – ‘Ask Programmatic Ponderings to find a post about Kubernetes’. Google Assistant will respond directly, both verbally and visually with the most relevant post.

sample-dialog-2

When a user requests a single result, for example, ‘Find a post about Docker’, Google Assistant will include Simple ResponseBasic Card, and Suggestion Chip response types for devices with a display. This is shown in the center, above. The user may continue to ask for additional facts or choose to cancel the Action at any time.

When a user requests multiple results, for example, ‘I’m interested in Docker’, Google Assistant will include Simple ResponseList, and Suggestion Chip response types for devices with a display. An example of a List Response is shown in the center of the previous set of screengrabs, above. The user will receive up to six results in the list, with a relevance score of 1.0 or greater. The user may choose to click on any of the post results in the list, which will initiate a new search using the post’s unique ID, as shown on the right, in the first set of screengrabs, above.

The conversational model also understands a request for help and to cancel the interaction.

GCP Account and Project

The following steps assume you have an existing GCP account and you have created a project on GCP to house the Cloud Function, GKE Cluster, and Elastic Stack on Google Compute Engine. The post also assumes that you have the latest Google Cloud SDK installed on your development machine, and have authenticated your identity from the command line (gist).

Elasticsearch on GCP

There are a number of options available to host Elasticsearch. Elastic, the company behind Elasticsearch, offers the Elasticsearch Service, a fully managed, scalable, and reliable service on AWS and GCP. AWS also offers their own managed Elasticsearch Service. I found some limitations with AWS’ Elasticsearch Service, which made integration with Spring Data Elasticsearch difficult. According to AWS, the service supports HTTP but does not support TCP transport.

For this post, we will stand up the Elastic Stack on GCP using an offering from the Google Cloud Platform Marketplace. A well-known provider of packaged applications for multiple Cloud platforms, Bitnami, offers the ELK Stack (the previous name for the Elastic Stack), running on Google Compute Engine.

wp-search-004.png

GCP Marketplace Solutions are deployed using the Google Cloud Deployment Manager.  The Bitnami ELK solution is a complete stack with all the necessary software and software-defined Cloud infrastructure to securely run Elasticsearch. You select the instance’s zone(s), machine type, boot disk size, and security and networking configurations. Using that configuration, the Deployment Manager will deploy the solution and provide you with information and credentials for accessing the Elastic Stack. For this demo, we will configure a minimally-sized, single VM instance to run the Elastic Stack.

wp-search-005.png

Below we see the Bitnami ELK stack’s components being created on GCP, by the Deployment Manager.

wp-search-006.png

Indexed Content

With the Elastic Stack fully provisioned, I then configured WordPress to index the last two years of the Programmatic Pondering blog posts to Elasticsearch on GCP. If you want to follow along with this post and content to index, there is plenty of open source and public domain indexable content available on the Internet – books, movie lists, government and weather data, online catalogs of products, and so forth. Anything in a document database is directly indexable in Elasticsearch. Elastic even provides a set of index samples, available on their GitHub site.

wp-search-009

Firewall Ports for Elasticseach

The Deployment Manager opens up firewall ports 80 and 443. To index the WordPress posts, I also had to open port 9200. According to Elastic, Elasticsearch uses port 9200 for communicating with their RESTful API with JSON over HTTP. For security, I locked down this firewall opening to my WordPress server’s address as the source. (gist).

The two existing firewall rules for port opening 80 and 443 should also be locked down to your own IP address as the source. Common Elasticsearch ports are constantly scanned by Hackers, who will quickly hijack your Elasticsearch contents and hold them for ransom, in addition to deleting your indexes. Similar tactics are used on well-known and unprotected ports for many platforms, including Redis, MySQL, PostgreSQL, MongoDB, and Microsoft SQL Server.

Kibana

Once the posts are indexed, the best way to view the resulting Elasticsearch documents is through Kibana, which is included as part of the Bitnami solution. Below we see approximately thirty posts, spread out across two years.

wp-search-010.png

Each Elasticsearch document, representing an indexed WordPress blog post, contains over 125 fields of information. Fields include a unique post ID, post title, content, publish date, excerpt, author, URL, and so forth. All these fields are exposed through Elasticsearch’s API, and as we will see,  will be available to our Spring Boot service to query.

wp-search-011.png

Spring Boot Service

To ensure decoupling between the Action for Google Assistant and Elasticsearch, we will expose a RESTful search API, written in Java using Spring Boot and Spring Data Elasticsearch. The API will expose a tailored set of flexible endpoints to the Action. Google’s machine learning services will ensure our conversational model is trained to understand user intent. The API’s query algorithm and Elasticsearch’s rich Lucene-based search features will ensure the most relevant results are returned. We will host the Spring Boot service on Google Kubernetes Engine (GKE).

Will use a Spring Rest Controller to expose our RESTful web service’s resources to our Action’s Cloud Function. The current Spring Boot service contains five /elastic resource endpoints exposed by the ElasticsearchPostController class . Of those five, two endpoints will be called by our Action in this demo, the /{id} and the /dismax-search endpoints. The endpoints can be seen using the Swagger UI. Our Spring Boot service implements SpringFox, which has the option to expose the Swagger interactive API UI.

wp-search-017.png

The /{id} endpoint accepts a unique post ID as a path variable in the API call and returns a single ElasticsearchPost object wrapped in a Map object, and serialized to a  JSON payload (gist).

Below we see an example response from the Spring Boot service to an API call to the /{id} endpoint, for post ID 22141. Since we are returning a single post, based on ID, the relevance score will always be 0.0 (gist).

This controller’s /{id} endpoint relies on a method exposed by the ElasticsearchPostRepository interface. The ElasticsearchPostRepository is a Spring Data Repository , which extends ElasticsearchRepository. The repository exposes the findById() method, which returns a single instance of the type, ElasticsearchPost, from Elasticsearch (gist).

The ElasticsearchPost class is annotated as an Elasticsearch Document, similar to other Spring Data Document annotations, such as Spring Data MongoDB. The ElasticsearchPost class is instantiated to hold deserialized JSON documents stored in ElasticSeach stores indexed data (gist).

Dis Max Query

The second API endpoint called by our Action is the /dismax-search endpoint. We use this endpoint to search for a particular post topic, such as ’Docker’. This type of search, as opposed to the Spring Data Repository method used by the /{id} endpoint, requires the use of an ElasticsearchTemplate. The ElasticsearchTemplate allows us to form more complex Elasticsearch queries than is possible using an ElasticsearchRepository class. Below, the /dismax-search endpoint accepts four input request parameters in the API call, which are the topic to search for, the starting point and size of the response to return, and the minimum relevance score (gist).

The logic to create and execute the ElasticsearchTemplate is handled by the ElasticsearchService class. The ElasticsearchPostController calls the ElasticsearchService. The ElasticsearchService handles querying Elasticsearch and returning a list of ElasticsearchPost objects to the ElasticsearchPostController. The dismaxSearch method, called by the /dismax-search endpoint’s method constructs the ElasticsearchTemplate instance, used to build the request to Elasticsearch’s RESTful API (gist).

To obtain the most relevant search results, we will use Elasticsearch’s Dis Max Query combined with the Match Phrase Query. Elastic describes the Dis Max Query as:

‘a query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries.

In short, the Dis Max Query allows us to query and weight (boost importance) multiple indexed fields, across all documents. The Match Phrase Query analyzes the text (our topic) and creates a phrase query out of the analyzed text.

After some experimentation, I found the valid search results were returned by applying greater weighting (boost) to the post’s title and excerpt, followed by the post’s tags and categories, and finally, the actual text of the post. I also limited results to a minimum score of 1.0. Just because a word or phrase is repeated in a post, doesn’t mean it is indicative of the post’s subject matter. Setting a minimum score attempts to help ensure the requested topic is featured more prominently in the resulting post or posts. Increasing the minimum score will decrease the number of search results, but theoretically, increase their relevance (gist).

Below we see the results of a /dismax-search API call to our service, querying for posts about the topic, ’Istio’, with a minimum score of 2.0. The search resulted in a serialized JSON payload containing three ElasticsearchPost objects (gist).

Understanding Relevance Scoring

When returning search results, such as in the example above, the top result is the one with the highest score. The highest score should denote the most relevant result to the search query. According to Elastic, in their document titled, The Theory Behind Relevance Scoring, scoring is explained this way:

‘Lucene (and thus Elasticsearch) uses the Boolean model to find matching documents, and a formula called the practical scoring function to calculate relevance. This formula borrows concepts from term frequency/inverse document frequency and the vector space model but adds more-modern features like a coordination factor, field length normalization, and term or query clause boosting.’

In order to better understand this technical explanation of relevance scoring, it is much easy to see it applied to our example. Note the first search result above, Post ID 21867, has the highest score, 5.91989. Knowing that we are searching five fields (title, excerpt, tags, categories, and content), and boosting certain fields more than others, how was this score determined? Conveniently, Spring Data Elasticsearch’s SearchRequestBuilder class exposed the setExplain method. We can see this on line 12 of the dimaxQuery method, shown above. By passing a boolean value of true to the setExplain method, we are able to see the detailed scoring algorithms used by Elasticsearch for the top result, shown above (gist).

What this detail shows us is that of the five fields searched, the term ‘Istio’ was located in four of the five fields (all except ‘categories’). Using the practical scoring function described by Elasticsearch, and taking into account our boost values, we see that the post’s ‘excerpt’ field achieved the highest score of 5.9198895 (score of 1.6739764 * boost of 3.0).

Being able to view the scoring explanation helps us tune our search results. For example, according to the details, the term ‘Istio’ appeared 100 times (termFreq=100.0) in the main body of the post (the ‘content’ field). We might ask ourselves if we are giving enough relevance to the content as opposed to other fields. We might choose to increase the boost or decrease other fields with respect to the ‘content’ field, to produce higher quality search results.

Google Kubernetes Engine

With the Elastic Stack running on Google Compute Engine, and the Spring Boot API service built, we can now provision a Kubernetes cluster to run our Spring Boot service. The service will sit between our Action’s Cloud Function and Elasticsearch. We will use Google Kubernetes Engine (GKE) to manage our Kubernete cluster on GCP. A GKE cluster is a managed group of uniform VM instances for running Kubernetes. The VMs are managed by Google Compute Engine. Google Compute Engine delivers virtual machines running in Google’s data centers, on their worldwide fiber network.

A GKE cluster can be provisioned using GCP’s Cloud Console or using the Cloud SDK, Google’s command-line interface for Google Cloud Platform products and services. I prefer using the CLI, which helps enable DevOps automation through tools like Jenkins and Travis CI (gist).

Below is the command I used to provision a minimally sized three-node GKE cluster, replete with the latest available version of Kubernetes. Although a one-node cluster is sufficient for early-stage development, testing should be done on a multi-node cluster to ensure the service will operate properly with multiple instances running behind a load-balancer (gist).

Below, we see the three n1-standard-1 instance type worker nodes, one in each of three different specific geographical locations, referred to as zones. The three zones are in the us-east1 region. Multiple instances spread across multiple zones provide single-region high-availability for our Spring Boot service. With GKE, the Master Node is fully managed by Google.

wp-search-015

Building Service Image

In order to deploy our Spring Boot service, we must first build a Docker Image and make that image available to our Kubernetes cluster. For lowest latency, I’ve chosen to build and publish the image to Google Container Registry, in addition to Docker Hub. The Spring Boot service’s Docker image is built on the latest Debian-based OpenJDK 10 Slim base image, available on Docker Hub. The Spring Boot JAR file is copied into the image (gist).

To automate the build and publish processes with tools such as Jenkins or Travis CI, we will use a simple shell script. The script builds the Spring Boot service using Gradle, then builds the Docker Image containing the Spring Boot JAR file, tags and publishes the Docker image to the image repository, and finally, redeploys the Spring Boot service container to GKE using kubectl (gist).

Below we see the latest version of our Spring Boot Docker image published to the Google Cloud Registry.

wp-search-016

Deploying the Service

To deploy the Spring Boot service’s container to GKE, we will use a Kubernetes Deployment Controller. The Deployment Controller manages the Pods and ReplicaSets. As a deployment alternative, you could choose to use CoreOS’ Operator Framework to create an Operator or use Helm to create a Helm Chart. Along with the Deployment Controller, there is a ConfigMap and a Horizontal Pod Autoscaler. The ConfigMap contains environment variables that will be available to the Spring Boot service instances running in the Kubernetes Pods. Variables include the host and port of the Elasticsearch cluster on GCP and the name of the Elasticsearch index created by WordPress. These values will override any configuration values set in the service’s application.yml Java properties file.

The Deployment Controller creates a ReplicaSet with three Pods, running the Spring Boot service, one on each worker node (gist).

To properly load-balance the three Spring Boot service Pods, we will also deploy a Kubernetes Service of the Kubernetes ServiceType, LoadBalancer. According to Kubernetes, a Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them (gist).

Below, we see three instances of the Spring Boot service deployed to the GKE cluster on GCP. Each Pod, containing an instance of the Spring Boot service, is in a load-balanced pool, behind our service load balancer, and exposed on port 80.

wp-search-014

Testing the API

We can test our API and ensure it is talking to Elasticsearch, and returning expected results using the Swagger UI, shown previously, or tools like Postman, shown below.

wp-search-018.png

Communication Between GKE and Elasticsearch

Similar to port 9200, which needed to be opened for indexing content over HTTP, we also need to open firewall port 9300 between the Spring Boot service on GKE and Elasticsearch. According to Elastic, Elasticsearch Java clients talk to the Elasticsearch cluster over port 9300, using the native Elasticsearch transport protocol (TCP).

Google Search Assistant Diagram WordPress Index

Again, locking this port down to the GKE cluster as the source is critical for security (gist).

Part Two

In part one we have examined the creation of the Elastic Stack, the provisioning of the GKE cluster, and the development and deployment of the Spring Boot service to Kubernetes. In part two of this post, we will tie everything together by creating and integrating our Action for Google Assistant:

  • Create the new Actions project using the Actions on Google console;
  • Develop the Action’s Intents using the Dialogflow console;
  • Develop, deploy, and test the Cloud Function to GCP;

Google Search Assistant Diagram part 2b.png

Related Posts

If you’re interested in comparing the development of an Action for Google Assistant with that of Amazon’s Alexa and Microsoft’s LUIS-enabled chatbots, in addition to this post, I would recommend the previous three posts in this conversation interface series:

All three article’s demonstrations leverage their respective Cloud platform’s machine learning-based Natural language understanding (NLU) services. All three take advantage of their respective Cloud platform’s NoSQL database and object storage services. Lastly, all three of the article’s demonstrations are written in a common language, Node.js.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers, their clients, or Google.

, , , , , , , , , , , , ,

1 Comment

Using Eventual Consistency and Spring for Kafka to Manage a Distributed Data Model: Part 2

Given a modern distributed system, composed of multiple microservices, each possessing a sub-set of the domain’s aggregate data they need to perform their functions autonomously, we will almost assuredly have some duplication of data. Given this duplication, how do we maintain data consistency? In this two-part post, we’ve been exploring one possible solution to this challenge, using Apache Kafka and the model of eventual consistency. In Part One, we examined the online storefront domain, the storefront’s microservices, and the system’s state change event message flows.

Part Two

In Part Two of this post, I will briefly cover how to deploy and run a local development version of the storefront components, using Docker. The storefront’s microservices will be exposed through an API Gateway, Netflix’s Zuul. Service discovery and load balancing will be handled by Netflix’s Eureka. Both Zuul and Eureka are part of the Spring Cloud Netflix project. To provide operational visibility, we will add Yahoo’s Kafka Manager and Mongo Express to our system.

docker-system-diagram

Source code for deploying the Dockerized components of the online storefront, shown in this post, is available on GitHub. All Docker Images are available on Docker Hub. I have chosen the wurstmeister/kafka-docker version of Kafka, available on Docker Hub; it has 580+ stars and 10M+ pulls on Docker Hub. This version of Kafka works well, as long as you run it within a Docker Swarm, locally.

Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

Deployment Options

For simplicity, I’ve used Docker’s native Docker Swarm Mode to support the deployed online storefront. Docker requires minimal configuration as opposed to other CaaS platforms. Usually, I would recommend Minikube for local development if the final destination of the storefront were Kubernetes in Production (AKS, EKS, or GKE). Alternatively, if the final destination of the storefront were Red Hat OpenShift in Production, I would recommend Minishift for local development.

Docker Deployment

We will break up our deployment into two parts. First, we will deploy everything, except our services. We will allow Kafka, MongoDB, Eureka, and the other components to startup up fully. Afterward, we will deploy the three online storefront services. The storefront-kafka-docker project on Github contains two Docker Compose files, which are divided between the two tasks.

The middleware Docker Compose file (gist).

The services Docker Compose file (gist).

In the storefront-kafka-docker project, there is a shell script, stack_deploy_local.sh. This script will execute both Docker Compose files, in succession, with a pause in between. You may need to adjust the timing for your own system (gist).

Start by running docker swarm init. This command will initialize a Docker Swarm. Next, execute the stack deploy script, using an sh ./stack_deploy_local.sh command. The script will deploy a new Docker Stack, within the Docker Swarm. The Docker Stack will hold all storefront components, deployed as individual Docker containers. The stack is deployed within its own isolated Docker overlay networkkafka-net.

Note we are not using host-based persistent storage for this local development demo. Destroying the Docker stack or the individual Kafka, Zookeeper, or MongoDB Docker containers will result in a loss of data.

stack-deploy

Before completion, the stack deploy script runs docker stack ls command, followed by a docker stack services storefront command. You should see one stack, names storefront, with ten services. You should also see each of the ten services has 1/1 replicas running, indicated everything has started or is starting correctly, without failure. A failure would be reflected here as a service having 0/1 replicas.

docker-stack-ls

Before completion, the stack deploy script also runs docker container ls command. You should observe each of the ten running containers (‘services’ in the Docker stack), along with their instance names and ports.

docker-container-ls

There is also a shell script, stack_delete_local.sh, which will issue a docker stack rm storefront command to destroy the stack when you are done.

Using the names of the storefront’s Docker containers, you can check the start-up logs of any of the components, using the docker logs command.

docker-logs

Testing the Stack

With the storefront stack deployed, we need to confirm that all the components have started correctly and are communicating with each other. To accomplish this, I’ve written a simple Python script, refresh.py. The refresh script has multiple uses. It deletes any existing storefront service MongoDB databases. It also deletes any existing Kafka topics; I call the Kafka Manager’s API to accomplish this. We have no databases or topics since our stack was just created. However, if you are actively developing your data models, you will likely want to purge the databases and topics regularly (gist).

Next, the refresh script calls a series of RESTful HTTP endpoints, in a specific order, to create sample data. Our three storefront services each expose different endpoints. The different /sample endpoints create sample customers, orders, order fulfillment requests, and shipping notifications. The create sample data endpoints include, in order:

  1. Sample Customer: /accounts/customers/sample
  2. Sample Orders: /orders/customers/sample/orders
  3. Sample Fulfillment Requests: /orders/customers/sample/fulfill
  4. Sample Processed Order Events: /fulfillment/fulfillment/sample/process
  5. Sample Shipped Order Events: /fulfillment/fulfillment/sample/ship
  6. Sample In-Transit Order Events: /fulfillment/fulfillment/sample/in-transit
  7. Sample Received Order Events: /fulfillment/fulfillment/sample/receive

You could create data on your own, by POSTing to the exposed CRUD endpoints on each service. However, given the complex data objects required in the request payloads, it is too time-consuming for this demo.

To execute the script, use a python3 ./refresh.py command. I am using Python 3 in the demo, but the script should also work with Python 2.x, if you change shebang.

refresh-script

If everything was successful, the script returns one document from each of the three storefront service’s MongoDB database collections. A result of ‘None’ for any of the MongoDB documents usually indicates one of the earlier commands failed. Given an abnormally high response latency, due to the load of the ten running containers on my laptop, I had to increase the Zuul/Ribbon timeouts.

Observing the System

We should now have the online storefront Docker stack running, three MongoDB databases created and populated with sample documents (data), and three Kafka topics, which have messages in them. Based on the fact we saw database documents printed out with our refresh script, we know the topics were used to pass data between the message producing and message consuming services.

In most enterprise environments, a developer may not the access, nor the operational knowledge to interact with Kafka or MongoDB from within a container, on the command line. So how else can we interact with the system?

Kafka Manager

Kafka Manager gives us the ability to interact with Kafka via a convenient browser-based user interface. For this demo, the Kafka Manager UI is available on default port 9000.

kafka_manager_00

To make Kafka Manager useful, define the Kafka cluster. The Cluster Name is up to you. The Cluster Zookeeper Host should be zookeeper:2181, for our demo.

kafka_manager_01

Kafka Manager gives us useful insights into many aspects of our simple, single-broker cluster. You should observe three topics, created during the deployment of Kafka.

kafka_manager_02

Kafka Manager is an appealing alternative, as opposed to connecting with the Kafka container, with a docker exec command, to interact with Kafka. A typical use case might be deleting a topic or adding partitions to a topic. We can also see which Consumers are consuming which topics, from within Kafka Manager.

kafka_manager_03

Mongo Express

Similar to Kafka Manager, Mongo Express gives us the ability to interact with Kafka via a user interface. For this demo, the Mongo Express browser-based user interface is available on default port 8081. The initial view displays each of the existing databases. Note our three service’s databases, including accounts, orders, and fulfillment.

mongo-express-01

Drilling into an individual database, we can view each of the database’s collections. Digging in further, we can interact with individual database collection documents.

mongo-express-02

We may even edit and save the documents.

mongo-express-03

SpringFox and Swagger

Each of the storefront services also implements SpringFox, the automated JSON API documentation for API’s built with Spring. With SpringFox, each service exposes a rich Swagger UI. The Swagger UI allows us to interact with service endpoints.

Since each service exposes its own Swagger interface, we must access them through the Zuul API Gateway on port 8080. In our demo environment, the Swagger browser-based user interface is accessible at /swagger-ui.html. Below, is a fully self-documented Orders service API, as seen through the Swagger UI.

I believe there are still some incompatibilities with the latest SpringFox release and Spring Boot 2, which prevents Swagger from showing the default Spring Data REST CRUD endpoints. Currently, you only see the API  endpoints you explicitly declare in your Controller classes.

swagger-ui-1

The service’s data models (POJOs) are also exposed through the Swagger UI by default. Below we see the Orders service’s models.

swagger-ui-3

The Swagger UI allows you to drill down into the complex structure of the models, such as the CustomerOrder entity, exposing each of the entity’s nested data objects.

swagger-ui-2

Spring Cloud Netflix Eureka

This post does not cover the use of Eureka or Zuul. Eureka gives us further valuable insight into our storefront system. Eureka is our systems service registry and provides load-balancing for our services if we had multiple instances.

For this demo, the Eureka browser-based user interface is available on default port 8761. Within the Eureka user interface, we should observe the three storefront services and Zuul, the API Gateway, registered with Eureka. If we had more than one instance of each service, we would see all of them listed here.

eureka-ui

Although of limited use in a local environment, we can observe some general information about our host.

eureka-ui-02

Interacting with the Services

The three storefront services are fully functional Spring Boot / Spring Data REST / Spring HATEOAS-enabled applications. Each service exposes a rich set of CRUD endpoints for interacting with the service’s data entities. Additionally, each service includes Spring Boot Actuator. Actuator exposes additional operational endpoints, allowing us to observe the running services. Again, this post is not intended to be a demonstration of Spring Boot or Spring Boot Actuator.

Using an application, such as Postman, we can interact with our service’s RESTful HTTP endpoints. Shown below, we are calling the Account service’s customers resource. The Accounts request is proxied through the Zuul API Gateway.

postman

The above Postman Storefront Collection and Postman Environment are both exported and saved with the project.

Some key endpoints to observe the entities that were created using Event-Carried State Transfer are as follows. They assume you are using localhost as a base URL.

References

Links to my GitHub projects for this post

Some additional references I found useful while authoring this post and the online storefront code:

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.

 

, , , , , , ,

Leave a comment

Managing Applications Across Multiple Kubernetes Environments with Istio: Part 2

In this two-part post, we are exploring the creation of a GKE cluster, replete with the latest version of Istio, often referred to as IoK (Istio on Kubernetes). We will then deploy, perform integration testing, and promote an application across multiple environments within the cluster.

Part Two

In Part One of this post, we created a Kubernetes cluster on the Google Cloud Platform, installed Istio, provisioned a PostgreSQL database, and configured DNS for routing. Under the assumption that v1 of the Election microservice had already been released to Production, we deployed v1 to each of the three namespaces.

In Part Two of this post, we will learn how to utilize the advanced API testing capabilities of Postman and Newman to ensure v2 is ready for UAT and release to Production. We will deploy and perform integration testing of a new v2 of the Election microservice, locally on Kubernetes Minikube. Once confident v2 is functioning as intended, we will promote and test v2 across the dev, test, and uat namespaces.

Source Code

As a reminder, all source code for this post can be found on GitHub. The project’s README file contains a list of the Election microservice’s endpoints. To get started quickly, use one of the two following options (gist).

Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

This project includes a kubernetes sub-directory, containing all the Kubernetes resource files and scripts necessary to recreate the example shown in the post.

Testing Locally with Minikube

Deploying to GKE, no matter how automated, takes time and resources, whether those resources are team members or just compute and system resources. Before deploying v2 of the Election service to the non-prod GKE cluster, we should ensure that it has been thoroughly tested locally. Local testing should include the following test criteria:

  1. Source code builds successfully
  2. All unit-tests pass
  3. A new Docker Image can be created from the build artifact
  4. The Service can be deployed to Kubernetes (Minikube)
  5. The deployed instance can connect to the database and execute the Liquibase changesets
  6. The deployed instance passes a minimal set of integration tests

Minikube gives us the ability to quickly iterate and test an application, as well as the Kubernetes and Istio resources required for its operation, before promoting to GKE. These resources include Kubernetes Namespaces, Secrets, Deployments, Services, Route Rules, and Istio Ingresses. Since Minikube is just that, a miniature version of our GKE cluster, we should be able to have a nearly one-to-one parity between the Kubernetes resources we apply locally and those applied to GKE. This post assumes you have the latest version of Minikube installed, and are familiar with its operation.

This project includes a minikube sub-directory, containing all the Kubernetes resource files and scripts necessary to recreate the Minikube deployment example shown in this post. The three included scripts are designed to be easily adapted to a CI/CD DevOps workflow. You may need to modify the scripts to match your environment’s configuration. Note this Minikube-deployed version of the Election service relies on the external Amazon RDS database instance.

Local Database Version

To eliminate the AWS costs, I have included a second, alternate version of the Minikube Kubernetes resource files, minikube_db_local This version deploys a single containerized PostgreSQL database instance to Minikube, as opposed to relying on the external Amazon RDS instance. Be aware, the database does not have persistent storage or an Istio sidecar proxy.

istio_100.png

Minikube Cluster

If you do not have a running Minikube cluster, create one with the minikube start command.

istio_081

Minikube allows you to use normal kubectl CLI commands to interact with the Minikube cluster. Using the kubectl get nodes command, we should see a single Minikube node running the latest Kubernetes v1.10.0.

istio_082

Istio on Minikube

Next, install Istio following Istio’s online installation instructions. A basic Istio installation on Minikube, without the additional add-ons, should only require a single Istio install script.

istio_083

If successful, you should observe a new istio-system namespace, containing the four main Istio components: istio-ca, istio-ingress, istio-mixer, and istio-pilot.

istio_084

Deploy v2 to Minikube

Next, create a Minikube Development environment, consisting of a dev Namespace, Istio Ingress, and Secret, using the part1-create-environment.sh script. Next, deploy v2 of the Election service to thedev Namespace, along with an associated Route Rule, using the part2-deploy-v2.sh script. One v2 instance should be sufficient to satisfy the testing requirements.

istio_085

Access to v2 of the Election service on Minikube is a bit different than with GKE. When routing external HTTP requests, there is no load balancer, no external public IP address, and no public DNS or subdomains. To access the single instance of v2 running on Minikube, we use the local IP address of the Minikube cluster, obtained with the minikube ip command. The access port required is the Node Port (nodePort) of the istio-ingress Service. The command is shown below (gist) and included in the part3-smoke-test.sh script.

The second part of our HTTP request routing is the same as with GKE, relying on an Istio Route Rules. The /v2/ sub-collection resource in the HTTP request URL is rewritten and routed to the v2 election Pod by the Route Rule. To confirm v2 of the Election service is running and addressable, curl the /v2/actuator/health endpoint. Spring Actuator’s /health endpoint is frequently used at the end of a CI/CD server’s deployment pipeline to confirm success. The Spring Boot application can take a few minutes to fully start up and be responsive to requests, depending on the speed of your local machine.

istio_093.png

Using the Kubernetes Dashboard, we should see our deployment of the single Election service Pod is running successfully in Minikube’s dev namespace.

istio_087

Once deployed, we run a battery of integration tests to confirm that the new v2 functionality is working as intended before deploying to GKE. In the next section of this post, we will explore the process creating and managing Postman Collections and Postman Environments, and how to automate those Collections of tests with Newman and Jenkins.

istio_088

Integration Testing

The typical reason an application is deployed to lower environments, prior to Production, is to perform application testing. Although definitions vary across organizations, testing commonly includes some or all of the following types: Integration Testing, Functional Testing, System Testing, Stress or Load Testing, Performance Testing, Security Testing, Usability Testing, Acceptance Testing, Regression Testing, Alpha and Beta Testing, and End-to-End Testing. Test teams may also refer to other testing forms, such as Whitebox (Glassbox), Blackbox Testing, Smoke, Validation, or Sanity Testing, and Happy Path Testing.

The site, softwaretestinghelp.com, defines integration testing as, ‘testing of all integrated modules to verify the combined functionality after integration is termed so. Modules are typically code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.

In this post, we are concerned that our integrated modules are functioning cohesively, primarily the Election service, Amazon RDS database, DNS, Istio Ingress, Route Rules, and the Istio sidecar Proxy. Unlike Unit Testing and Static Code Analysis (SCA), which is done pre-deployment, integration testing requires an application to be deployed and running in an environment.

Postman

I have chosen Postman, along with Newman, to execute a Collection of integration tests before promoting to the next environment. The integration tests confirm the deployed application’s name and version. The integration tests then perform a series of HTTP GET, POST, PUT, PATCH, and DELETE actions against the service’s resources. The integration tests verify a successful HTTP response code is returned, based on the type of request made.

istio_055

Postman tests are written in JavaScript, similar to other popular, modern testing frameworks. Postman offers advanced features such as test-chaining. Tests can be chained together through the use of environment variables to store response values and pass them onto to other tests. Values shared between tests are also stored in the Postman Environments. Below, we store the ID of the new candidate, the result of an HTTP POST to the /candidates endpoint. We then use the stored candidate ID in proceeding HTTP GET, PUT, and PATCH test requests to the same /candidates endpoint.

istio_056

Environment-specific variables, such as the resource host, port, and environment sub-collection resource, are abstracted and stored as key/value pairs within Postman Environments, and called through variables in the request URL and within the tests. Thus, the same Postman Collection of tests may be run against multiple environments using different Postman Environments.

istio_057

Postman Runner allows us to run multiple iterations of our Collection. We also have the option to build in delays between tests. Lastly, Postman Runner can load external JSON and CSV formatted test data, which is beyond the scope of this post.

istio_058

Postman contains a simple Run Summary UI for viewing test results.

istio_060

Test Automation

To support running tests from the command line, Postman provides Newman. According to Postman, Newman is a command-line collection runner for Postman. Newman offers the same functionality as Postman’s Collection Runner, all part of the newman CLI. Newman is Node.js module, installed globally as an npm package, npm install newman --global.

Typically, Development and Testing teams compose Postman Collections and define Postman Environments, locally. Teams run their tests locally in Postman, during their development cycle. Then, those same Postman Collections are executed from the command line, or more commonly as part of a CI/CD pipeline, such as with Jenkins.

Below, the same Collection of integration tests ran in the Postman Runner UI, are run from the command line, using Newman.

istio_061

Jenkins

Without a doubt, Jenkins is the leading open-source CI/CD automation server. The building, testing, publishing, and deployment of microservices to Kubernetes is relatively easy with Jenkins. Generally, you would build, unit-test, push a new Docker image, and then deploy your application to Kubernetes using a series of CI/CD pipelines. Below, we see examples of these pipelines using Jenkins Blue Ocean, starting with a continuous integration pipeline, which includes unit-testing and Static Code Analysis (SCA) with SonarQube.

istio_108

Followed by a pipeline to build the Docker Image, using the build artifact from the above pipeline, and pushes the Image to Docker Hub.

istio_109

The third pipeline that demonstrates building the three Kubernetes environments and deploying v1 of the Election service to the dev namespace. This pipeline is just for demonstration purposes; typically, you would separate these functions.

istio_110

Spinnaker

An alternative to Jenkins for the deployment of microservices is Spinnaker, created by Netflix. According to Netflix, ‘Spinnaker is an open source, multi-cloud continuous delivery platform for releasing software changes with high velocity and confidence.’ Spinnaker is designed to integrate easily with Jenkins, dividing responsibilities for continuous integration and delivery, with deployment. Below, Spinnaker two sample deployment pipelines, similar to Jenkins, for deploying v1 and v2 of the Election service to the non-prod GKE cluster.

spin_07

Below, Spinnaker has deployed v2 of the Election service to dev using a Highlander deployment strategy. Subsequently, Spinnaker has deployed v2 to test using a Red/Black deployment strategy, leaving the previously released v1 Server Group in place, in case a rollback is required.

spin_08

Once Spinnaker is has completed the deployment tasks, the Postman Collections of smoke and integration tests are executed by Newman, as part of another Jenkins CI/CD pipeline.

istio_101B.png

In this pipeline, a set of basic smoke tests is run first to ensure the new deployment is running properly, and then the integration tests are executed.

istio_102

In this simple example, we have a three-stage pipeline created from a Jenkinsfile (gist).

Test Results

Newman offers several options for displaying test results. For easy integration with Jenkins, Newman results can be delivered in a format that can be displayed as JUnit test reports. The JUnit test report format, XML, is a popular method of standardizing test results from different testing tools. Below is a truncated example of a test report file (gist).

Translating Newman test results to JUnit reports allows the percentage of test cases successfully executed, to be tracked over multiple deployments, a universal testing metric. Below we see the JUnit Test Reports Test Result Trend graph for a series of test runs.

istio_103

Deploying to Development

Development environments typically have a rapid turnover of application versions. Many teams use their Development environment as a continuous integration environment, where every commit that successfully builds and passes all unit tests, is deployed. The purpose of the CI deployments is to ensure build artifacts will successfully deploy through the CI/CD pipeline, start properly, and pass a basic set of smoke tests.

Other teams use the Development environments as an extension of their local Minikube environment. The Development environment will possess some or all of the required external integration points, which the Developer’s local Minikube environment may not. The goal of the Development environment is to help Developers ensure their application is functioning correctly and is ready for the Test teams to evaluate, prior to promotion to the Test environment.

Some external integration points, such as external payment gateways, customer relationship management (CRM) systems, content management systems (CMS), or data analytics engines, are often stubbed-out in lower environments. Generally, third-party providers only offer a limited number of parallel non-Production integration environments. While an application may pass through several non-prod environments, testing against all external integration points will only occur in one or two of those environments.

With v2 of the Election service ready for testing on GKE, we deploy it to the GKE cluster’s dev namespace using the part4a-deploy-v2-dev.sh script. We will also delete the previous v1 version of the Election service. Similar to the v1 deployment script, the v2 scripts perform a kube-inject command, which manually injects the Istio sidecar proxy alongside the Election service, into each election v2 Pod. The deployment script also deploys an alternate Istio Route Rule, which routes requests to api.dev.voter-demo.com/v2/* resource of v2 of the Election service.

istio_054.png

Once deployed, we run our Postman Collection of integration tests with Newman or as part of a CI/CD pipeline. In the Development environment, we may choose to run a limited set of tests for the sake of expediency, or because not all external integration points are accessible.

Promotion to Test

With local Minikube and Development environment testing complete, we promote and deploy v2 of the Election service to the Test environment, using the part4b-deploy-v2-test.sh script. In Test, we will not delete v1 of the Election service.

istio_062

Often, an organization will maintain a running copy of all versions of an application currently deployed to Production, in a lower environment. Let’s look at two scenarios where this is common. First, v1 of the Election service has an issue in Production, which needs to be confirmed and may require a hot-fix by the Development team. Validation of the v1 Production bug is often done in a lower environment. The second scenario for having both versions running in an environment is when v1 and v2 both need to co-exist in Production. Organizations frequently support multiple API versions. Cutting over an entire API user-base to a new API version is often completed over a series of releases, and requires careful coordination with API consumers.

Testing All Versions

An essential role of integration testing should be to confirm that both versions of the Election service are functioning correctly, while simultaneously running in the same namespace. For example, we want to verify traffic is routed correctly, based on the HTTP request URL, to the correct version. Another common test scenario is database schema changes. Suppose we make what we believe are backward-compatible database changes to v2 of the Election service. We should be able to prove, through testing, that both the old and new versions function correctly against the latest version of the database schema.

There are different automation strategies that could be employed to test multiple versions of an application without creating separate Collections and Environments. A simple solution would be to templatize the Environments file, and then programmatically change the Postman Environment’s version variable injected from a pipeline parameter (abridged environment file shown below).

istio_095.png

Once initial automated integration testing is complete, Test teams will typically execute additional forms of application testing if necessary, before signing off for UAT and Performance Testing to begin.

User-Acceptance Testing

With testing in the Test environments completed, we continue onto UAT. The term UAT suggest that a set of actual end-users (API consumers) of the Election service will perform their own testing. Frequently, UAT is only done for a short, fixed period of time, often with a specialized team of Testers. Issues experienced during UAT can be expensive and impact the ability to release an application to Production on-time if sign-off is delayed.

After deploying v2 of the Election service to UAT, and before opening it up to the UAT team, we would naturally want to repeat the same integration testing process we conducted in the previous Test environment. We must ensure that v2 is functioning as expected before our end-users begin their testing. This is where leveraging a tool like Jenkins makes automated integration testing more manageable and repeatable. One strategy would be to duplicate our existing Development and Test pipelines, and re-target the new pipeline to call v2 of the Election service in UAT.

istio_104.png

Again, in a JUnit report format, we can examine individual results through the Jenkins Console.

istio_105.png

We can also examine individual results from each test run using a specific build’s Console Output.

istio_106.png

Testing and Instrumentation

To fully evaluate the integration test results, you must look beyond just the percentage of test cases executed successfully. It makes little sense to release a new version of an application if it passes all functional tests, but significantly increases client response times, unnecessarily increases memory consumption or wastes other compute resources, or is grossly inefficient in the number of calls it makes to the database or third-party dependencies. Often times, integration testing uncovers potential performance bottlenecks that are incorporated into performance test plans.

Critical intelligence about the performance of the application can only be obtained through the use of logging and metrics collection and instrumentation. Istio provides this telemetry out-of-the-box with Zipkin, Jaeger, Service Graph, Fluentd, Prometheus, and Grafana. In the included Grafana Istio Dashboard below, we see the performance of v1 of the Election service, under test, in the Test environment. We can compare request and response payload size and timing, as well as request and response times to external integration points, such as our Amazon RDS database. We are able to observe the impact of individual test requests on the application and all its integration points.

istio_067

As part of integration testing, we should monitor the Amazon RDS CloudWatch metrics. CloudWatch allows us to evaluate critical database performance metrics, such as the number of concurrent database connections, CPU utilization, read and write IOPS, Memory consumption, and disk storage requirements.

istio_043

A discussion of metrics starts moving us toward load and performance testing against Production service-level agreements (SLAs). Using a similar approach to integration testing, with load and performance testing, we should be able to accurately estimate the sizing requirements our new application for Production. Load and Performance Testing helps answer questions like the type and size of compute resources are required for our GKE Production cluster and for our Amazon RDS database, or how many compute nodes and number of instances (Pods) are necessary to support the expected user-load.

All opinions expressed in this post are my own, and not necessarily the views of my current or past employers, or their clients.

, , , , , , , , , , , , , ,

2 Comments

Managing Applications Across Multiple Kubernetes Environments with Istio: Part 1

In the following two-part post, we will explore the creation of a GKE cluster, replete with the latest version of Istio, often referred to as IoK (Istio on Kubernetes). We will then deploy, perform integration testing, and promote an application across multiple environments within the cluster.

Application Environment Management

Container orchestration engines, such as Kubernetes, have revolutionized the deployment and management of microservice-based architectures. Combined with a Service Mesh, such as Istio, Kubernetes provides a secure, instrumented, enterprise-grade platform for modern, distributed applications.

One of many challenges with any platform, even one built on Kubernetes, is managing multiple application environments. Whether applications run on bare-metal, virtual machines, or within containers, deploying to and managing multiple application environments increases operational complexity.

As Agile software development practices continue to increase within organizations, the need for multiple, ephemeral, on-demand environments also grows. Traditional environments that were once only composed of Development, Test, and Production, have expanded in enterprises to include a dozen or more environments, to support the many stages of the modern software development lifecycle. Current application environments often include Continous Integration and Delivery (CI), Sandbox, Development, Integration Testing (QA), User Acceptance Testing (UAT), Staging, Performance, Production, Disaster Recovery (DR), and Hotfix. Each environment requiring its own compute, security, networking, configuration, and corresponding dependencies, such as databases and message queues.

Environments and Kubernetes

There are various infrastructure architectural patterns employed by Operations and DevOps teams to provide Kubernetes-based application environments to Development teams. One pattern consists of separate physical Kubernetes clusters. Separate clusters provide a high level of isolation. Isolation offers many advantages, including increased performance and security, the ability to tune each cluster’s compute resources to meet differing SLAs, and ensuring a reduced blast radius when things go terribly wrong. Conversely, separate clusters often result in increased infrastructure costs and operational overhead, and complex deployment strategies. This pattern is often seen in heavily regulated, compliance-driven organizations, where security, auditability, and separation of duties are paramount.

Kube Clusters Diagram F15

Namespaces

An alternative to separate physical Kubernetes clusters is virtual clusters. Virtual clusters are created using Kubernetes Namespaces. According to Kubernetes documentation, ‘Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces’.

In most enterprises, Operations and DevOps teams deliver a combination of both virtual and physical Kubernetes clusters. For example, lower environments, such as those used for Development, Test, and UAT, often reside on the same physical cluster, each in a separate virtual cluster (namespace). At the same time, environments such as Performance, Staging, Production, and DR, often require the level of isolation only achievable with physical Kubernetes clusters.

In the Cloud, physical clusters may be further isolated and secured using separate cloud accounts. For example, with AWS you might have a Non-Production AWS account and a Production AWS account, both managed by an AWS Organization.

Kube Clusters Diagram v2 F3

In a multi-environment scenario, a single physical cluster would contain multiple namespaces, into which separate versions of an application or applications are independently deployed, accessed, and tested. Below we see a simple example of a single Kubernetes non-prod cluster on the left, containing multiple versions of different microservices, deployed across three namespaces. You would likely see this type of deployment pattern as applications are deployed, tested, and promoted across lower environments, before being released to Production.

Kube Clusters Diagram v2 F5.png

Example Application

To demonstrate the promotion and testing of an application across multiple environments, we will use a simple election-themed microservice, developed for a previous post, Developing Cloud-Native Data-Centric Spring Boot Applications for Pivotal Cloud Foundry. The Spring Boot-based application allows API consumers to create, read, update, and delete, candidates, elections, and votes, through an exposed set of resources, accessed via RESTful endpoints.

Source Code

All source code for this post can be found on GitHub. The project’s README file contains a list of the Election microservice’s endpoints. To get started quickly, use one of the two following options (gist).

Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

This project includes a kubernetes sub-directory, containing all the Kubernetes resource files and scripts necessary to recreate the example shown in the post. The scripts are designed to be easily adapted to a CI/CD DevOps workflow. You will need to modify the script’s variables to match your own environment’s configuration.

istio_107small

Database

The post’s Spring Boot application relies on a PostgreSQL database. In the previous post, ElephantSQL was used to host the PostgreSQL instance. This time, I have used Amazon RDS for PostgreSQL. Amazon RDS for PostgreSQL and ElephantSQL are equivalent choices. For simplicity, you might also consider a containerized version of PostgreSQL, managed as part of your Kubernetes environment.

Ideally, each environment should have a separate database instance. Separate database instances provide better isolation, fine-grained RBAC, easier test data lifecycle management, and improved performance. Although, for this post, I suggest a single, shared, minimally-sized RDS instance.

The PostgreSQL database’s sensitive connection information, including database URL, username, and password, are stored as Kubernetes Secrets, one secret for each namespace, and accessed by the Kubernetes Deployment controllers.

istio_043.png

Istio

Although not required, Istio makes the task of managing multiple virtual and physical clusters significantly easier. Following Istio’s online installation instructions, download and install Istio 0.7.1.

To create a Google Kubernetes Engine (GKE) cluster with Istio, you could use gcloud CLI’s container clusters create command, followed by installing Istio manually using Istio’s supplied Kubernetes resource files. This was the method used in the previous post, Deploying and Configuring Istio on Google Kubernetes Engine (GKE).

Alternatively, you could use Istio’s Google Cloud Platform (GCP) Deployment Manager files, along with the gcloud CLI’s deployment-manager deployments create command to create a Kubernetes cluster, replete with Istio, in a single step. Although arguably simpler, the deployment-manager method does not provide the same level of fine-grain control over cluster configuration as the container clusters create method. For this post, the deployment-manager method will suffice.

istio_001

The latest version of the Google Kubernetes Engine, available at the time of this post, is 1.9.6-gke.0. However, to install this version of Kubernetes Engine using the Istio’s supplied deployment Manager Jinja template requires updating the hardcoded value in the istio-cluster.jinja file from 1.9.2-gke.1. This has been updated in the next release of Istio.

istio_002

Another change, the latest version of Istio offered as an option in the istio-cluster-jinja.schema file. Specifically, the installIstioRelease configuration variable is only 0.6.0. The template does not include 0.7.1 as an option. Modify the istio-cluster-jinja.schema file to include the choice of 0.7.1. Optionally, I also set 0.7.1 as the default. This change should also be included in the next version of Istio.

istio_075.png

There are a limited number of GKE and Istio configuration defaults defined in the istio-cluster.yaml file, all of which can be overridden from the command line.

istio_002B.png

To optimize the cluster, and keep compute costs to a minimum, I have overridden several of the default configuration values using the properties flag with the gcloud CLI’s deployment-manager deployments create command. The README file provided by Istio explains how to use this feature. Configuration changes include the name of the cluster, the version of Istio (0.7.1), the number of nodes (2), the GCP zone (us-east1-b), and the node instance type (n1-standard-1). I also disabled automatic sidecar injection and chose not to install the Istio sample book application onto the cluster (gist).

Cluster Provisioning

To provision the GKE cluster and deploy Istio, first modify the variables in the part1-create-gke-cluster.sh file (shown above), then execute the script. The script also retrieves your cluster’s credentials, to enable command line interaction with the cluster using the kubectl CLI.

istio_002C.png

Once complete, validate the version of Istio by examining Istio’s Docker image versions, using the following command (gist).

The result should be a list of Istio 0.7.1 Docker images.

istio_076.png

The new cluster should be running GKE version 1.9.6.gke.0. This can be confirmed using the following command (gist).

Or, from the GCP Cloud Console.

istio_037

The new GKE cluster should be composed of (2) n1-standard-1 nodes, running in the us-east-1b zone.

istio_038

As part of the deployment, all of the separate Istio components should be running within the istio-system namespace.

istio_040

As part of the deployment, an external IP address and a load balancer were provisioned by GCP and associated with the Istio Ingress. GCP’s Deployment Manager should have also created the necessary firewall rules for cluster ingress and egress.

istio_010.png

Building the Environments

Next, we will create three namespaces,dev, test, and uat, which represent three non-production environments. Each environment consists of a Kubernetes Namespace, Istio Ingress, and Secret. The three environments are deployed using the part2-create-environments.sh script.

istio_048.png

Deploying Election v1

For this demonstration, we will assume v1 of the Election service has been previously promoted, tested, and released to Production. Hence, we would expect v1 to be deployed to each of the lower environments. Additionally, a new v2 of the Election service has been developed and tested locally using Minikube. It is ready for deployment to the three environments and will undergo integration testing (detailed in Part Two of the post).

If you recall from our GKE/Istio configuration, we chose manual sidecar injection of the Istio proxy. Therefore, all election deployment scripts perform a kube-inject command. To connect to our external Amazon RDS database, this kube-inject command requires the includeIPRanges flag, which contains two cluster configuration values, the cluster’s IPv4 CIDR (clusterIpv4Cidr) and the service’s IPv4 CIDR (servicesIpv4Cidr).

Before deployment, we export the includeIPRanges value as an environment variable, which will be used by the deployment scripts, using the following command, export IP_RANGES=$(sh ./get-cluster-ip-ranges.sh). The get-cluster-ip-ranges.sh script is shown below (gist).

Using this method with manual sidecar injection is discussed in the previous post, Deploying and Configuring Istio on Google Kubernetes Engine (GKE).

To deploy v1 of the Election service to all three namespaces, execute the part3-deploy-v1-all-envs.sh script.

istio_051.png

We should now have two instances of v1 of the Election service, running in the dev, test, and uat namespaces, for a total of six election-v1 Kubernetes Pods.

istio_052

HTTP Request Routing

Before deploying additional versions of the Election service in Part Two of this post, we should understand how external HTTP requests will be routed to different versions of the Election service, in multiple namespaces. In the post’s simple example, we have a matrix of three namespaces and two versions of the Election service. That means we need a method to route external traffic to up to six different election versions. There multiple ways to solve this problem, each with their own pros and cons. For this post, I found a combination of DNS and HTTP request rewriting is most effective.

DNS

First, to route external HTTP requests to the correct namespace, we will use subdomains. Using my current DNS management solution, Azure DNS, I create three new A records for my registered domain, voter-demo.com. There is one A record for each namespace, including api.dev, api.test, and api.uat.

istio_077.png

All three subdomains should resolve to the single external IP address assigned to the cluster’s load balancer.

istio_010.png

As part of the environments creation, the script deployed an Istio Ingress, one to each environment. The ingress accepts traffic based on a match to the Request URL (gist).

The istio-ingress service load balancer, running in the istio-system namespace, routes inbound external traffic, based on the Request URL, to the Istio Ingress in the appropriate namespace.

istio_053.png

The Istio Ingress in the namespace then directs the traffic to one of the Kubernetes Pods, containing the Election service and the Istio sidecar proxy.

istio_068.png

HTTP Rewrite

To direct the HTTP request to v1 or v2 of the Election service, an Istio Route Rule is used. As part of the environment creation, along with a Namespace and Ingress resources, we also deployed an Istio Route Rule to each environment. This particular route rule examines the HTTP request URL for a /v1/ or /v2/ sub-collection resource. If it finds the sub-collection resource, it performs a HTTPRewrite, removing the sub-collection resource from the HTTP request. The Route Rule then directs the HTTP request to the appropriate version of the Election service, v1 or v2 (gist).

According to Istio, ‘if there are multiple registered instances with the specified tag(s), they will be routed to based on the load balancing policy (algorithm) configured for the service (round-robin by default).’ We are using the default load balancing algorithm to distribute requests across multiple copies of each Election service.

The final external HTTP request routing for the Election service in the Non-Production GKE cluster is shown on the left, in the diagram, below. Every Election service Pod also contains an Istio sidecar proxy instance.

Kube Clusters Diagram F14

Below are some examples of HTTP GET requests that would be successfully routed to our Election service, using the above-described routing strategy (gist).

Part Two

In Part One of this post, we created the Kubernetes cluster on the Google Cloud Platform, installed Istio, provisioned a PostgreSQL database, and configured DNS for routing. Under the assumption that v1 of the Election microservice had already been released to Production, we deployed v1 to each of the three namespaces.

In Part Two of this post, we will learn how to utilize the sophisticated API testing capabilities of Postman and Newman to ensure v2 is ready for UAT and release to Production. We will deploy and perform integration testing of a new, v2 of the Election microservice, locally, on Kubernetes Minikube. Once we are confident v2 is functioning as intended, we will promote and test v2, across the dev, test, and uat namespaces.

All opinions expressed in this post are my own, and not necessarily the views of my current or past employers, or their clients.

, , , , , , , , , , ,

1 Comment

First Impressions of AKS, Azure’s New Managed Kubernetes Container Service

Kubernetes as a Service

On October 24, 2017, less than a month prior to writing this post, Microsoft released the public preview of Managed Kubernetes for Azure Container Service (AKS). According to Microsoft, the goal of AKS is to simplify the deployment, management, and operations of Kubernetes. According to PM Lead, Containers @ Microsoft Azure, in a blog post, AKS ‘features an Azure-hosted control plane, automated upgrades, self-healing, easy scaling.’ Monroy goes on to say, ‘with AKS, customers get the benefit of open source Kubernetes without complexity and operational overhead.

Unquestionably, Kubernetes has become the leading Container-as-a-Service (CaaS) choice, at least for now. Along with the release of AKS by Microsoft, there have been other recent announcements, which reinforce Kubernetes dominance. In late September, Rancher Labs announced the release of Rancher 2.0. According to Rancher, Rancher 2.0 would be based on Kubernetes. In mid-October, at DockerCon Europe 2017, Docker announced they were integrating Kubernetes into the Docker platform. Even AWS seems to be warming up to Kubernetes, despite their own ECS, according to sources. There are rumors AWS will announce a Kubernetes offering at AWS re:Invent 2017, starting a week from now.

Previewing AKS

Being a big fan of both Azure and Kubernetes, I decided to give AKS a try. I chose to deploy an existing, simple, multi-tier web application, which I had used in several previous posts, including Eventual Consistency: Decoupling Microservices with Spring AMQP and RabbitMQ. All the code used for this post is available on GitHub.

Sample Application

The application, the Voter application, is composed of an AngularJS frontend client-side UI, and two Java Spring Boot microservices, both backed by individual MongoDB databases, and fronted with an HAProxy-based API Gateway. The AngularJS UI calls the API Gateway, which in turn calls the Spring services. The two microservices communicate with each other using HTTP-based inter-process communication (IPC). Although I would prefer event-based service-to-service IPC, HTTP-based IPC was simpler to implement, for this post.

AKS v4

Interestingly, the Voter application was designed using Docker Community Edition for Mac and deployed to AWS using Docker Community Edition for AWS. Not only would this be my chance to preview AKS, but also an opportunity to compare the ease of developing for Docker CE on AWS using a Mac, to developing for Kubernetes with AKS using Docker Community Edition for Windows.

Required Software

In order to develop for AKS on my Windows 10 Enterprise workstation, I first made sure I had the latest copies of the following software:

If you are following along with the post, make sure you have the latest version of the Azure CLI, minimally 2.0.21, according to the Azure CLI release notes. Also, I happen to be running the latest version of Docker CE from the Edge Channel. However, either channel’s latest release of Docker CE for Windows should work for this post. Using PowerShell is optional. I prefer PowerShell over working from the Windows Command Prompt, if for nothing else than to preserve my command history, by default.

AKS_V2_04

Kubernetes Resources with Kompose

Originally developed for Docker CE, the Voter application stack was defined in a single Docker Compose file.

To work on AKS, the application stack’s configuration needs to be reproduced as Kubernetes configuration files. Instead of writing the configuration files manually, I chose to use kompose. Kompose is described on its website as ‘a conversion tool for Docker Compose to container orchestrators such as Kubernetes.’ Using kompose, I was able to automatically convert the Docker Compose file into analogous Kubernetes resource configuration files.

kompose convert -f docker-compose.yml

Each Docker service in the Docker Compose file was translated into a separate Kubernetes Deployment resource configuration file, as well as a corresponding Service resource configuration file.

AKS_Demo_08

For the AngularJS Client Service and the HAProxy API Gateway Service, I had to modify the Service configuration files to switch the Service type to a Load Balancer (type: LoadBalancer). Being a Load Balancer, Kubernetes will assign a publically accessible IP address to each Service; the reasons for which are explained later in the post.

The MongoDB service requires a persistent storage volume. To accomplish this with Kubernetes, kompose created a PersistentVolumeClaims resource configuration file. I did have to create a corresponding PersistentVolume resource configuration file. It was also necessary to modify the PersistentVolumeClaims resource configuration file, specifying the Storage Class Name as manual, to correspond to the AKS Storage Class configuration (storageClassName: manual).

From the original Docker Compose file, containing five Docker services, I ended up with a dozen individual Kubernetes resource configuration files. Individual configuration files are optimal for fine-grain management of Kubernetes resources. The Docker Compose file and the Kubernetes resource configuration files are included in the GitHub project.

git clone \
  --branch master --single-branch --depth 1 --no-tags \
  https://github.com/garystafford/azure-aks-demo.git

Creating AKS Resources

New AKS Feature Flag

According to Microsoft, to start with AKS, while still a preview, creating new clusters requires a feature flag on your subscription.

az provider register -n Microsoft.ContainerService

Using a brand new Azure account for this demo, I also needed to activate two additional feature flags.

az provider register -n Microsoft.Network
az provider register -n Microsoft.Compute

If you are missing required features flags, you will see errors, similar to. the below error.

Operation failed with status: ’Bad Request’. Details: Required resource provider registrations Microsoft.Compute,Microsoft.Network are missing.

Resource Group

AKS requires an Azure Resource Group for AKS. I chose to create a new Resource Group, using the Azure CLI.

az group create \
  --resource-group resource_group_name_goes_here \
  --location eastus

AKS_Demo_01

New Kubernetes Cluster

Using the aks feature of the Azure CLI version 2.0.21 or later, I provisioned a new Kubernetes cluster. By default, Azure will create a 3-node cluster. You can override the default number of nodes using the --node-count parameter; I chose one node. The version of Kubernetes you choose is also configurable using the --kubernetes-version parameter. I selected the latest Kubernetes version available with AKS, 1.8.2.

az aks create \
  --name cluser_name_goes_here \
  --resource-group resource_group_name_goes_here \
  --node-count 1 \
  --generate-ssh-keys \
  --kubernetes-version 1.8.2

AKS_Demo_03

The newly created Azure Resource Group and AKS Kubernetes Cluster were both then visible on the Azure Portal.

AKS_Demo_04b

In addition to the new Resource Group I created, Azure also created a second Resource Group containing seven Azure resources. These Azure resources include a Virtual Machine (the single Kubernetes node), Network Security Group, Network Interface, Virtual Network, Route Table, Disk, and an Availability Group.

AKS_Demo_05b

With AKS up and running, I used another Azure CLI command to create a proxy connection to the Kubernetes Dashboard, which was deployed automatically and was running within the new AKS Cluster. The Kubernetes Dashboard is a general purpose, web-based UI for Kubernetes clusters.

az aks browse \
  --name cluser_name_goes_here \
  --resource-group resource_group_name_goes_here

AKS_Demo_23

Although no applications were deployed to AKS, yet, there were several Kubernetes components running within the AKS Cluster. Kubernetes components running within the kube-system Namespace, included heapster, kube-dns, kubernetes-dashboard, kube-proxy, kube-svc-redirect, and tunnelfront.

AKS_Demo_06B

Deploying the Application

MongoDB should be deployed first. Both the Voter and Candidate microservices depend on MongoDB. MongoDB is composed of four Kubernetes resources, a Deployment resource, Service resource, PersistentVolumeClaim resource, and PersistentVolume resource. I used kubectl, the command line interface for running commands against Kubernetes clusters, to create the four MongoDB resources, from the configuration files.

kubectl create \
  -f voter-data-vol-persistentvolume.yaml \
  -f voter-data-vol-persistentvolumeclaim.yaml \
  -f mongodb-deployment.yaml \
  -f mongodb-service.yaml

AKS_Demo_09

After MongoDB was deployed and running, I created the four remaining Deployment resources, Client, Gateway, Voter, and Candidate, from the Deployment resource configuration files. According to Kubernetes, ‘a Deployment controller provides declarative updates for Pods and ReplicaSets. You describe a desired state in a Deployment object, and the Deployment controller changes the actual state to the desired state at a controlled rate.

AKS_Demo_10

Lastly, I created the remaining Service resources from the Service resource configuration files. According to Kubernetes, ‘a Service is an abstraction which defines a logical set of Pods and a policy by which to access them.

AKS_Demo_11

Switching back to the Kubernetes Dashboard, the Voter application components were now visible.

AKS_Demo_13

There were five Kubernetes Pods, one for each application component. Since there is only one Node in the Kubernetes Cluster, all five Pods were deployed to the same Node. There were also five corresponding Kubernetes Deployments.

AKS_Demo_14

Similarly, there were five corresponding Kubernetes ReplicaSets, the next-generation Replication Controller. There were also five corresponding Kubernetes Services. Note the Gateway and Client Services have an External Endpoint (External IP) associated with them. The IPs were created as a result of adding the Load Balancer Service type to their Service resource configuration files, mentioned earlier.

AKS_Demo_15.PNG

Lastly, note the Persistent Disk Claim for MongoDB, which had been successfully bound.

AKS_Demo_16

Switching back to the Azure Portal, following the application deployment, there were now three additional resources in the AKS Resource Group, a new Azure Load Balancer and two new Public IP Addresses. The Load Balancer is used to balance the Client and Gateway Services, which both have public IP addresses.

AKS_Demo_21b

To confirm the Gateway, Voter, and Candidate Services were reachable, using the public IP address of the Gateway Service, I browsed to HAProxy’s Statistics web page. Note the two backends, candidate and voter. The green color means HAProxy was able to successfully connect to both of these Services.

AKS_Demo_12

Accessing the Application

The Voter application’s AngularJS UI frontend can be accessed using the Client Service’s public IP address. However, this would not be very user-friendly. Even if I brought up the UI, using the public IP, the UI would be unable to connect to the HAProxy API Gateway, and subsequently, the Voter or Candidate Services. Based on the Client’s current configuration, the Client is expecting to find the Gateway at api.voter-demo.com:8080.

To make accessing the Client more user-friendly, and to ensure the Client has access to the Gateway, I provisioned an Azure DNS Zone resource for my domain, voter-demo.com. I assigned the DNS Zone to the AKS Resource Group.

AKS_Demo_20b

Within the new DNS Zone, I created three DNS records. The first record, an Alias (A) record, associated voter-demo.com with the public IP address of the Client Service. I added a second Alias (A) record for the www subdomain, also associating it with the public IP address of the Client Service. The third Alias (A) record associated the api subdomain with the public IP address of the Gateway Service.

AKS_Demo_18b

At a high level, the voter application’s routing architecture looks as follows. The client’s requests to the primary domain or to the api subdomain are resolved to one of the two public IP addresses configured in the load balancer’s frontend. The requests are passed to the load balancer’s backend pool, containing the single Azure VM, which is the single Kubernetes node, and onto the client or gateway Kubernetes Service. From there, requests are routed to one the appropriate Kubernetes Pods, containing the containerized application components, client or gateway.

kub-aks

Browsing to either http://voter-demo.com or http://www.voter-demo.com should bring up the Voter app UI (oh look, Hillary won this time…).

Mobile_App_View

Using Chrome’s Developer Tools, observe when a new vote is placed, an HTTP POST is made to the gateway, on the /voter/votes endpoint, http://api.voter-demo.com:8080/voter/votes. The Gateway then proxies this request to the Voter Service, at http://voter:8080/voter/votes. Since the Gateway and Voter Services both run within the same Cluster, the Gateway is able to use the Voter service’s name to address it, using Kubernetes kube-dns.

AKS_Demo_19b

Conclusion

In the past, I have developed, deployed, and managed containerized applications, using Rancher, AWS, AWS ECS, native Kubernetes, RedHat OpenShift, Docker Enterprise Edition, and Docker Community Edition. Based on my experience, and given my limited testing with Azure’s public preview of AKS, I am very impressed. Creating the Kubernetes Cluster could not have been easier. Scaling the Cluster, adding Persistent Volumes, and upgrading Kubernetes, is equally as easy. Conveniently, AKS integrates with other Kubernetes tools, like kubectl, kompose, and Helm.

I did run into some minor issues with the AKS Preview, such as being unable to connect to an earlier Cluster, after upgrading from the default Kubernetes version to 1.82. I also experience frequent disconnects when proxying to the Kubernetes Dashboard. I am sure the AKS Preview bugs will be worked out by the time AKS is officially released.

In addition to the many advantages of Kubernetes as a CaaS, a huge advantage of using AKS is the ability to easily integrate Azure’s many enterprise-grade compute, networking, database, caching, storage, and messaging resources. It would require minimal effort to swap out the Voter application’s single containerized version of MongoDB with a highly performant and available instance of Azure Cosmos DB. Similarly, it would be relatively easy to swap out the single containerized version of HAProxy with a fully-featured and secure instance of Azure API Management. The current version of the Voter application replies on RabbitMQ for service-to-service IPC versus this earlier application version’s reliance on HTTP-based IPC. It would be fairly simple to swap RabbitMQ for Azure Service Bus.

Lastly, AKS easily integrates with leading Development and DevOps tooling and processes. Building, managing, and deploying applications to AKS, is possible with Visual Studio, VSTS, Jenkins, Terraform, and Chef, according to Microsoft.

References

A few good references to get started with AKS:

All opinions in this post are my own, and not necessarily the views of my current or past employers or their clients.

, , , , , , , ,

2 Comments