Posts Tagged Spring Boot

Building a Microservices Platform with Confluent Cloud, MongoDB Atlas, Istio, and Google Kubernetes Engine

Leading SaaS providers have sufficiently matured the integration capabilities of their product offerings to a point where it is now reasonable for enterprises to architect multi-vendor, single- and multi-cloud Production platforms, without re-engineering existing cloud-native applications. In previous posts, we have integrated other SaaS products, including as MongoDB Atlas fully-managed MongoDB-as-a-service, ElephantSQL fully-manage PostgreSQL-as-a-service, and CloudAMQP RabbitMQ-as-a-service, into cloud-native applications on Azure, AWS, GCP, and PCF.

In this post, we will build and deploy an existing, Spring Framework, microservice-based, cloud-native API to Google Kubernetes Engine (GKE), replete with Istio 1.0, on Google Cloud Platform (GCP). The API will rely on Confluent Cloud to provide a fully-managed, Kafka-based messaging-as-a-service (MaaS). Similarly, the API will rely on MongoDB Atlas to provide a fully-managed, MongoDB-based Database-as-a-service (DBaaS).

Background

In a previous two-part post, Using Eventual Consistency and Spring for Kafka to Manage a Distributed Data Model: Part 1 and Part 2, we examined the role of Apache Kafka in an event-driven, eventually consistent, distributed system architecture. The system, an online storefront RESTful API simulation, was composed of multiple, Java Spring Boot microservices, each with their own MongoDB database. The microservices used a publish/subscribe model to communicate with each other using Kafka-based messaging. The Spring services were built using the Spring for Apache Kafka and Spring Data MongoDB projects.

Given the use case of placing an order through the Storefront API, we examined the interactions of three microservices, the Accounts, Fulfillment, and Orders service. We examined how the three services used Kafka to communicate state changes to each other, in a fully-decoupled manner.

The Storefront API’s microservices were managed behind an API Gateway, Netflix’s Zuul. Service discovery and load balancing were handled by Netflix’s Eureka. Both Zuul and Eureka are part of the Spring Cloud Netflix project. In that post, the entire containerized system was deployed to Docker Swarm.

Kafka-Eventual-Cons-Swarm.png

Developing the services, not operationalizing the platform, was the primary objective of the previous post.

Featured Technologies

The following technologies are featured prominently in this post.

Confluent Cloud

confluent_cloud_apache-300x228

In May 2018, Google announced a partnership with Confluence to provide Confluent Cloud on GCP, a managed Apache Kafka solution for the Google Cloud Platform. Confluent, founded by the creators of Kafka, Jay Kreps, Neha Narkhede, and Jun Rao, is known for their commercial, Kafka-based streaming platform for the Enterprise.

Confluent Cloud is a fully-managed, cloud-based streaming service based on Apache Kafka. Confluent Cloud delivers a low-latency, resilient, scalable streaming service, deployable in minutes. Confluent deploys, upgrades, and maintains your Kafka clusters. Confluent Cloud is currently available on both AWS and GCP.

Confluent Cloud offers two plans, Professional and Enterprise. The Professional plan is optimized for projects under development, and for smaller organizations and applications. Professional plan rates for Confluent Cloud start at $0.55/hour. The Enterprise plan adds full enterprise capabilities such as service-level agreements (SLAs) with a 99.95% uptime and virtual private cloud (VPC) peering. The limitations and supported features of both plans are detailed, here.

MongoDB Atlas

mongodb

Similar to Confluent Cloud, MongoDB Atlas is a fully-managed MongoDB-as-a-Service, available on AWS, Azure, and GCP. Atlas, a mature SaaS product, offers high-availability, uptime SLAs, elastic scalability, cross-region replication, enterprise-grade security, LDAP integration, BI Connector, and much more.

MongoDB Atlas currently offers four pricing plans, Free, Basic, Pro, and Enterprise. Plans range from the smallest, M0-sized MongoDB cluster, with shared RAM and 512 MB storage, up to the massive M400 MongoDB cluster, with 488 GB of RAM and 3 TB of storage.

MongoDB Atlas has been featured in several past posts, including Deploying and Configuring Istio on Google Kubernetes Engine (GKE) and Developing Applications for the Cloud with Azure App Services and MongoDB Atlas.

Kubernetes Engine

gkeAccording to Google, Google Kubernetes Engine (GKE) provides a fully-managed, production-ready Kubernetes environment for deploying, managing, and scaling your containerized applications using Google infrastructure. GKE consists of multiple Google Compute Engine instances, grouped together to form a cluster.

A forerunner to other managed Kubernetes platforms, like EKS (AWS), AKS (Azure), PKS (Pivotal), and IBM Cloud Kubernetes Service, GKE launched publicly in 2015. GKE was built on Google’s experience of running hyper-scale services like Gmail and YouTube in containers for over 12 years.

GKE’s pricing is based on a pay-as-you-go, per-second-billing plan, with no up-front or termination fees, similar to Confluent Cloud and MongoDB Atlas. Cluster sizes range from 1 – 1,000 nodes. Node machine types may be optimized for standard workloads, CPU, memory, GPU, or high-availability. Compute power ranges from 1 – 96 vCPUs and memory from 1 – 624 GB of RAM.

Demonstration

In this post, we will deploy the three Storefront API microservices to a GKE cluster on GCP. Confluent Cloud on GCP will replace the previous Docker-based Kafka implementation. Similarly, MongoDB Atlas will replace the previous Docker-based MongoDB implementation.

ConfluentCloud-v3a.png

Kubernetes and Istio 1.0 will replace Netflix’s Zuul and  Eureka for API management, load-balancing, routing, and service discovery. Google Stackdriver will provide logging and monitoring. Docker Images for the services will be stored in Google Container Registry. Although not fully operationalized, the Storefront API will be closer to a Production-like platform, than previously demonstrated on Docker Swarm.

ConfluentCloudRouting.png

For brevity, we will not enable standard API security features like HTTPS, OAuth for authentication, and request quotas and throttling, all of which are essential in Production. Nor, will we integrate a full lifecycle API management tool, like Google Apigee.

Source Code

The source code for this demonstration is contained in four separate GitHub repositories, storefront-kafka-dockerstorefront-demo-accounts, storefront-demo-orders, and, storefront-demo-fulfillment. However, since the Docker Images for the three storefront services are available on Docker Hub, it is only necessary to clone the storefront-kafka-docker project. This project contains all the code to deploy and configure the GKE cluster and Kubernetes resources (gist).

Source code samples in this post are displayed as GitHub Gists, which may not display correctly on all mobile and social media browsers.

Setup Process

The setup of the Storefront API platform is divided into a few logical steps:

  1. Create the MongoDB Atlas cluster;
  2. Create the Confluent Cloud Kafka cluster;
  3. Create Kafka topics;
  4. Modify the Kubernetes resources;
  5. Modify the microservices to support Confluent Cloud configuration;
  6. Create the GKE cluster with Istio on GCP;
  7. Apply the Kubernetes resources to the GKE cluster;
  8. Test the Storefront API, Kafka, and MongoDB are functioning properly;

MongoDB Atlas Cluster

This post assumes you already have a MongoDB Atlas account and an existing project created. MongoDB Atlas accounts are free to set up if you do not already have one. Account creation does require the use of a Credit Card.

For minimal latency, we will be creating the MongoDB Atlas, Confluent Cloud Kafka, and GKE clusters, all on the Google Cloud Platform’s us-central1 Region. Available GCP Regions and Zones for MongoDB Atlas, Confluent Cloud, and GKE, vary, based on multiple factors.

screen_shot_2018-12-23_at_6.48.12_pm

For this demo, I suggest creating a free, M0-sized MongoDB cluster. The M0-sized 3-data node cluster, with shared RAM and 512 MB of storage, and currently running MongoDB 4.0.4, is fine for individual development. The us-central1 Region is the only available US Region for the free-tier M0-cluster on GCP. An M0-sized Atlas cluster may take between 7-10 minutes to provision.

screen_shot_2018-12-23_at_6.49.24_pm

MongoDB Atlas’ Web-based management console provides convenient links to cluster details, metrics, alerts, and documentation.

screen_shot_2018-12-23_at_6.51.41_pm

Once the cluster is ready, you can review details about the cluster and each individual cluster node.

screen_shot_2018-12-23_at_6.51.54_pm

In addition to the account owner, create a demo_user account. This account will be used to authenticate and connect with the MongoDB databases from the storefront services. For this demo, we will use the same, single user account for all three services. In Production, you would most likely have individual users for each service.

screen_shot_2018-12-23_at_6.52.18_pm

Again, for security purposes, Atlas requires you to whitelist the IP address or CIDR block from which the storefront services will connect to the cluster. For now, open the access to your specific IP address using whatsmyip.com, or much less-securely, to all IP addresses (0.0.0.0/0). Once the GKE cluster and external static IP addresses are created, make sure to come back and update this value; do not leave this wide open to the Internet.

screen_shot_2018-12-23_at_6.52.36_pm

The Java Spring Boot storefront services use a Spring Profile, gke. According to Spring, Spring Profiles provide a way to segregate parts of your application configuration and make it available only in certain environments. The gke Spring Profile’s configuration values may be set in a number of ways. For this demo, the majority of the values will be set using Kubernetes Deployment, ConfigMap and Secret resources, shown later.

The first two Spring configuration values will need are the MongoDB Atlas cluster’s connection string and the demo_user account password. Note these both for later use.

screen_shot_2018-12-23_at_6.53.00_pm

Confluent Cloud Kafka Cluster

Similar to MongoDB Atlas, this post assumes you already have a Confluent Cloud account and an existing project. It is free to set up a Professional account and a new project if you do not already have one. Atlas account creation does require the use of a Credit Card.

The Confluent Cloud web-based management console is shown below. Experienced users of other SaaS platforms may find the Confluent Cloud web-based console a bit sparse on features. In my opinion, the console lacks some necessary features, like cluster observability, individual Kafka topic management, detailed billing history (always says $0?), and persistent history of cluster activities, which survives cluster deletion. It seems like Confluent prefers users to download and configure their Confluent Control Center to get the functionality you might normally expect from a web-based Saas management tool.

screen_shot_2018-12-23_at_6.34.18_pm

As explained earlier, for minimal latency, I suggest creating the MongoDB Atlas cluster, Confluent Cloud Kafka cluster, and the GKE cluster, all on the Google Cloud Platform’s us-central1 Region. For this demo, choose the smallest cluster size available on GCP, in the us-central1 Region, with 1 MB/s R/W throughput and 500 MB of storage. As shown below, the cost will be approximately $0.55/hour. Don’t forget to delete this cluster when you are done with the demonstration, or you will continue to be charged.

screen_shot_2018-12-23_at_6.34.56_pm

Cluster creation of the minimally-sized Confluent Cloud cluster is pretty quick.

screen_shot_2018-12-23_at_6.39.52_pmOnce the cluster is ready, Confluent provides instructions on how to interact with the cluster via the Confluent Cloud CLI. Install the Confluent Cloud CLI, locally, for use later.

screen_shot_2018-12-23_at_6.35.56_pm

As explained earlier, the Java Spring Boot storefront services use a Spring Profile, gke. Like MongoDB Atlas, the Confluent Cloud Kafka cluster configuration values will be set using Kubernetes ConfigMap and Secret resources, shown later. There are several Confluent Cloud Java configuration values shown in the Client Config Java tab; we will need these for later use.

screen_shot_2018-12-23_at_6.36.12_pm

SASL and JAAS

Some users may not be familiar with the terms, SASL and JAAS. According to Wikipedia, Simple Authentication and Security Layer (SASL) is a framework for authentication and data security in Internet protocols. According to Confluent, Kafka brokers support client authentication via SASL. SASL authentication can be enabled concurrently with SSL encryption (SSL client authentication will be disabled).

There are numerous SASL mechanisms.  The PLAIN SASL mechanism (SASL/PLAIN), used by Confluent, is a simple username/password authentication mechanism that is typically used with TLS for encryption to implement secure authentication. Kafka supports a default implementation for SASL/PLAIN which can be extended for production use. The SASL/PLAIN mechanism should only be used with SSL as a transport layer to ensure that clear passwords are not transmitted on the wire without encryption.

According to Wikipedia, Java Authentication and Authorization Service (JAAS) is the Java implementation of the standard Pluggable Authentication Module (PAM) information security framework. According to Confluent, Kafka uses the JAAS for SASL configuration. You must provide JAAS configurations for all SASL authentication mechanisms.

Cluster Authentication

Similar to MongoDB Atlas, we need to authenticate with the Confluent Cloud cluster from the storefront services. The authentication to Confluent Cloud is done with an API Key. Create a new API Key, and note the Key and Secret; these two additional pieces of configuration will be needed later.

screen_shot_2018-12-23_at_6.38.09_pm

Confluent Cloud API Keys can be created and deleted as necessary. For security in Production, API Keys should be created for each service and regularly rotated.

screen_shot_2018-12-23_at_6.38.21_pm

Kafka Topics

With the cluster created, create the storefront service’s three Kafka topics manually, using the Confluent Cloud’s ccloud CLI tool. First, configure the Confluent Cloud CLI using the ccloud init command, using your new cluster’s Bootstrap Servers address, API Key, and API Secret. The instructions are shown above Clusters Client Config tab of the Confluent Cloud web-based management interface.

screen_shot_2018-12-26_at_2.05.09_pm

Create the storefront service’s three Kafka topics using the ccloud topic create command. Use the list command to confirm they are created.

# manually create kafka topics
ccloud topic create accounts.customer.change
ccloud topic create fulfillment.order.change
ccloud topic create orders.order.fulfill
  
# list kafka topics
ccloud topic list
  
accounts.customer.change
fulfillment.order.change
orders.order.fulfill

Another useful ccloud command, topic describe, displays topic replication details. The new topics will have a replication factor of 3 and a partition count of 12.

screen_shot_2018-12-26_at_5.03.11_pm

Adding the --verbose flag to the command, ccloud --verbose topic describe, displays low-level topic and cluster configuration details, as well as a log of all topic-related activities.

screen_shot_2018-12-26_at_5.07.20_pm

Kubernetes Resources

The deployment of the three storefront microservices to the dev Namespace will minimally require the following Kubernetes configuration resources.

  • (1) Kubernetes Namespace;
  • (3) Kubernetes Deployments;
  • (3) Kubernetes Services;
  • (1) Kubernetes ConfigMap;
  • (2) Kubernetes Secrets;
  • (1) Istio 1.0 Gateway;
  • (1) Istio 1.0 VirtualService;
  • (2) Istio 1.0 ServiceEntry;

The Istio networking.istio.io v1alpha3 API introduced the last three configuration resources in the list, to control traffic routing into, within, and out of the mesh. There are a total of four new io networking.istio.io v1alpha3 API routing resources: Gateway, VirtualService, DestinationRule, and ServiceEntry.

Creating and managing such a large number of resources is a common complaint regarding the complexity of Kubernetes. Imagine the resource sprawl when you have dozens of microservices replicated across several namespaces. Fortunately, all resource files for this post are included in the storefront-kafka-docker project’s gke directory.

To follow along with the demo, you will need to make minor modifications to a few of these resources, including the Istio Gateway, Istio VirtualService, two Istio ServiceEntry resources, and two Kubernetes Secret resources.

Istio Gateway & VirtualService

Both the Istio Gateway and VirtualService configuration resources are contained in a single file, istio-gateway.yaml. For the demo, I am using a personal domain, storefront-demo.com, along with the sub-domain, api.dev, to host the Storefront API. The domain’s primary A record (‘@’) and sub-domain A record are both associated with the external IP address on the frontend of the load balancer. In the file, this host is configured for the Gateway and VirtualService resources. You can choose to replace the host with your own domain, or simply remove the host block altogether on lines 13–14 and 21–22. Removing the host blocks, you would then use the external IP address on the frontend of the load balancer (explained later in the post) to access the Storefront API (gist).

Istio ServiceEntry

There are two Istio ServiceEntry configuration resources. Both ServiceEntry resources control egress traffic from the Storefront API services, both of their ServiceEntry Location items are set to MESH_INTERNAL. The first ServiceEntry, mongodb-atlas-external-mesh.yaml, defines MongoDB Atlas cluster egress traffic from the Storefront API (gist).

The other ServiceEntry, confluent-cloud-external-mesh.yaml, defines Confluent Cloud Kafka cluster egress traffic from the Storefront API (gist).

Both need to have their host items replaced with the appropriate Atlas and Confluent URLs.

Inspecting Istio Resources

The easiest way to view Istio resources is from the command line using the istioctl and kubectl CLI tools.

istioctl get gateway
istioctl get virtualservices
istioctl get serviceentry
  
kubectl describe gateway
kubectl describe virtualservices
kubectl describe serviceentry

Multiple Namespaces

In this demo, we are only deploying to a single Kubernetes Namespace, dev. However, Istio will also support routing traffic to multiple namespaces. For example, a typical non-prod Kubernetes cluster might support devtest, and uat, each associated with a different sub-domain. One way to support multiple Namespaces with Istio 1.0 is to add each host to the Istio Gateway (lines 14–16, below), then create a separate Istio VirtualService for each Namespace. All the VirtualServices are associated with the single Gateway. In the VirtualService, each service’s host address is the fully qualified domain name (FQDN) of the service. Part of the FQDN is the Namespace, which we change for each for each VirtualService (gist).

MongoDB Atlas Secret

There is one Kubernetes Secret for the sensitive MongoDB configuration and one Secret for the sensitive Confluent Cloud configuration. The Kubernetes Secret object type is intended to hold sensitive information, such as passwords, OAuth tokens, and SSH keys.

The mongodb-atlas-secret.yaml file contains the MongoDB Atlas cluster connection string, with the demo_user username and password, one for each of the storefront service’s databases (gist).

Kubernetes Secrets are Base64 encoded. The easiest way to encode the secret values is using the Linux base64 program. The base64 program encodes and decodes Base64 data, as specified in RFC 4648. Pass each MongoDB URI string to the base64 program using echo -n.

MONGODB_URI=mongodb+srv://demo_user:your_password@your_cluster_address/accounts?retryWrites=true
echo -n $MONGODB_URI | base64

bW9uZ29kYitzcnY6Ly9kZW1vX3VzZXI6eW91cl9wYXNzd29yZEB5b3VyX2NsdXN0ZXJfYWRkcmVzcy9hY2NvdW50cz9yZXRyeVdyaXRlcz10cnVl

Repeat this process for the three MongoDB connection strings.

screen_shot_2018-12-26_at_2.15.21_pm

Confluent Cloud Secret

The confluent-cloud-kafka-secret.yaml file contains two data fields in the Secret’s data map, bootstrap.servers and sasl.jaas.config. These configuration items were both listed in the Client Config Java tab of the Confluent Cloud web-based management console, as shown previously. The sasl.jaas.config data field requires the Confluent Cloud cluster API Key and Secret you created earlier. Again, use the base64 encoding process for these two data fields (gist).

Confluent Cloud ConfigMap

The remaining five Confluent Cloud Kafka cluster configuration values are not sensitive, and therefore, may be placed in a Kubernetes ConfigMapconfluent-cloud-kafka-configmap.yaml (gist).

Accounts Deployment Resource

To see how the services consume the ConfigMap and Secret values, review the Accounts Deployment resource, shown below. Note the environment variables section, on lines 44–90, are a mix of hard-coded values and values referenced from the ConfigMap and two Secrets, shown above (gist).

Modify Microservices for Confluent Cloud

As explained earlier, Confluent Cloud’s Kafka cluster requires some very specific configuration, based largely on the security features of Confluent Cloud. Connecting to Confluent Cloud requires some minor modifications to the existing storefront service source code. The changes are identical for all three services. To understand the service’s code, I suggest reviewing the previous post, Using Eventual Consistency and Spring for Kafka to Manage a Distributed Data Model: Part 1. Note the following changes are already made to the source code in the gke git branch, and not necessary for this demo.

The previous Kafka SenderConfig and ReceiverConfig Java classes have been converted to Java interfaces. There are four new SenderConfigConfluent, SenderConfigNonConfluent, ReceiverConfigConfluent, and ReceiverConfigNonConfluent classes, which implement one of the new interfaces. The new classes contain the Spring Boot Profile class-level annotation. One set of Sender and Receiver classes are assigned the @Profile("gke") annotation, and the others, the @Profile("!gke") annotation. When the services start, one of the two class implementations are is loaded, depending on the Active Spring Profile, gke or not gke. To understand the changes better, examine the Account service’s SenderConfigConfluent.java file (gist).

Line 20: Designates this class as belonging to the gke Spring Profile.

Line 23: The class now implements an interface.

Lines 25–44: Reference the Confluent Cloud Kafka cluster configuration. The values for these variables will come from the Kubernetes ConfigMap and Secret, described previously, when the services are deployed to GKE.

Lines 55–59: Additional properties that have been added to the Kafka Sender configuration properties, specifically for Confluent Cloud.

Once code changes were completed and tested, the Docker Image for each service was rebuilt and uploaded to Docker Hub for public access. When recreating the images, the version of the Java Docker base image was upgraded from the previous post to Alpine OpenJDK 12 (openjdk:12-jdk-alpine).

Google Kubernetes Engine (GKE) with Istio

Having created the MongoDB Atlas and Confluent Cloud clusters, built the Kubernetes and Istio resources, modified the service’s source code, and pushed the new Docker Images to Docker Hub, the GKE cluster may now be built.

For the sake of brevity, we will manually create the cluster and deploy the resources, using the Google Cloud SDK gcloud and Kubernetes kubectl CLI tools, as opposed to automating with CI/CD tools, like Jenkins or Spinnaker. For this demonstration, I suggest a minimally-sized two-node GKE cluster using n1-standard-2 machine-type instances. The latest available release of Kubernetes on GKE at the time of this post was 1.11.5-gke.5 and Istio 1.03 (Istio on GKE still considered beta). Note Kubernetes and Istio are evolving rapidly, thus the configuration flags often change with newer versions. Check the GKE Clusters tab for the latest clusters create command format (gist).

Executing these commands successfully will build the cluster and the dev Namespace, into which all the resources will be deployed. The two-node cluster creation process takes about three minutes on average.

screen_shot_2018-12-26_at_2.00.56_pm

We can also observe the new GKE cluster from the GKE Clusters Details tab.

screen_shot_2018-12-26_at_2.18.32_pm

Creating the GKE cluster also creates several other GCP resources, including a TCP load balancer and three external IP addresses. Shown below in the VPC network External IP addresses tab, there is one IP address associated with each of the two GKE cluster’s VM instances, and one IP address associated with the frontend of the load balancer.

screen_shot_2018-12-26_at_2.59.38_pm

While the TCP load balancer’s frontend is associated with the external IP address, the load balancer’s backend is a target pool, containing the two GKE cluster node machine instances.

screen_shot_2018-12-26_at_2.58.42_pm

A forwarding rule associates the load balancer’s frontend IP address with the backend target pool. External requests to the frontend IP address will be routed to the GKE cluster. From there, requests will be routed by Kubernetes and Istio to the individual storefront service Pods, and through the Istio sidecar (Envoy) proxies. There is an Istio sidecar proxy deployed to each Storefront service Pod.

screen_shot_2018-12-26_at_2.59.59_pm

Below, we see the details of the load balancer’s target pool, containing the two GKE cluster’s VMs.

screen_shot_2018-12-26_at_3.57.03_pm.png

As shown at the start of the post, a simplified view of the GCP/GKE network routing looks as follows. For brevity, firewall rules and routes are not illustrated in the diagram.

ConfluentCloudRouting

Apply Kubernetes Resources

Again, using kubectl, deploy the three services and associated Kubernetes and Istio resources. Note the Istio Gateway and VirtualService(s) are not deployed to the dev Namespace since their role is to control ingress and route traffic to the dev Namespace and the services within it (gist).

Once these commands complete successfully, on the Workloads tab, we should observe two Pods of each of the three storefront service Kubernetes Deployments deployed to the dev Namespace, all six Pods with a Status of ‘OK’. A Deployment controller provides declarative updates for Pods and ReplicaSets.

screen_shot_2018-12-26_at_2.51.01_pm

On the Services tab, we should observe the three storefront service’s Kubernetes Services. A Service in Kubernetes is a REST object.

screen_shot_2018-12-26_at_2.51.16_pm

On the Configuration Tab, we should observe the Kubernetes ConfigMap and two Secrets also deployed to the dev Environment.

screen_shot_2018-12-26_at_2.51.36_pm

Below, we see the confluent-cloud-kafka ConfigMap resource with its data map of Confluent Cloud configuration.

screen_shot_2018-12-23_at_10.54.51_pm

Below, we see the confluent-cloud-kafka Secret with its data map of sensitive Confluent Cloud configuration.

screen_shot_2018-12-23_at_10.55.17_pm

Test the Storefront API

If you recall from part two of the previous post, there are a set of seven Storefront API endpoints that can be called to create sample data and test the API. The HTTP GET Requests hit each service, generate test data, populate the three MongoDB databases, and produce and consume Kafka messages across all three topics. Making these requests is the easiest way to confirm the Storefront API is working properly.

  1. Sample Customer: accounts/customers/sample
  2. Sample Orders: orders/customers/sample/orders
  3. Sample Fulfillment Requests: orders/customers/sample/fulfill
  4. Sample Processed Order Event: fulfillment/fulfillment/sample/process
  5. Sample Shipped Order Event: fulfillment/fulfillment/sample/ship
  6. Sample In-Transit Order Event: fulfillment/fulfillment/sample/in-transit
  7. Sample Received Order Event: fulfillment/fulfillment/sample/receive

Thee are a wide variety of tools to interact with the Storefront API. The project includes a simple Python script, sample_data.py, which will make HTTP GET requests to each of the above endpoints, after confirming their health, and return a success message.

screen_shot_2018-12-31_at_12.19.50_pm.png

Postman

Postman, my personal favorite, is also an excellent tool to explore the Storefront API resources. I have the above set of the HTTP GET requests saved in a Postman Collection. Using Postman, below, we see the response from an HTTP GET request to the /accounts/customers endpoint.

screen_shot_2018-12-26_at_5.48.34_pm

Postman also allows us to create integration tests and run Collections of Requests in batches using Postman’s Collection Runner. To test the Storefront API, below, I used Collection Runner to run a single series of integration tests, intended to confirm the API’s functionality, by checking for expected HTTP response codes and expected values in the response payloads. Postman also shows the response times from the Storefront API. Since this platform was not built to meet Production SLAs, measuring response times is less critical in the Development environment.

screen_shot_2018-12-26_at_5.47.57_pm

Google Stackdriver

If you recall, the GKE cluster had the Stackdriver Kubernetes option enabled, which gives us, amongst other observability features, access to all cluster, node, pod, and container logs. To confirm data is flowing to the MongoDB databases and Kafka topics, we can check the logs from any of the containers. Below we see the logs from the two Accounts Pod containers. Observe the AfterSaveListener handler firing on an onAfterSave event, which sends a CustomerChangeEvent payload to the accounts.customer.change Kafka topic, without error. These entries confirm that both Atlas and Confluent Cloud are reachable by the GKE-based workloads, and appear to be functioning properly.

screen_shot_2018-12-26_at_8.05.50_pm.png

MongoDB Atlas Collection View

Review the MongoDB Atlas Clusters Collections tab. In this Development environment, the MongoDB databases and collections are created the first time a service tries to connects to them. In Production, the databases would be created and secured in advance of deploying resources. Once the sample data requests are completed successfully, you should now observe the three Storefront API databases, each with collections of documents.

screen_shot_2018-12-26_at_4.56.25_pm

MongoDB Compass

In addition to the Atlas web-based management console, MongoDB Compass is an excellent desktop tool to explore and manage MongoDB databases. Compass is available for Mac, Linux, and Windows. One of the many great features of Compass is the ability to visualize collection schemas and interactively filter documents. Below we see the fulfillment.requests collection schema.

Screen Shot 2019-01-20 at 10.21.54 AM.png

Confluent Control Center

Confluent Control Center is a downloadable, web browser-based tool for managing and monitoring Apache Kafka, including your Confluent Cloud clusters. Confluent Control Center provides rich functionality for building and monitoring production data pipelines and streaming applications. Confluent offers a free 30-day trial of Confluent Control Center. Since the Control Center is provided at an additional fee, and I found difficult to configure for Confluent Cloud clusters based on Confluent’s documentation, I chose not to cover it in detail, for this post.

screen_shot_2018-12-23_at_10.21.41_pm

screen_shot_2018-12-23_at_10.48.49_pm

Tear Down Cluster

Delete your Confluent Cloud and MongoDB clusters using their web-based management consoles. To delete the GKE cluster and all deployed Kubernetes resources, use the cluster delete command. Also, double-check that the external IP addresses and load balancer, associated with the cluster, were also deleted as part of the cluster deletion (gist).

Conclusion

In this post, we have seen how easy it is to integrate Cloud-based DBaaS and MaaS products with the managed Kubernetes services from GCP, AWS, and Azure. As this post demonstrated, leading SaaS providers have sufficiently matured the integration capabilities of their product offerings to a point where it is now reasonable for enterprises to architect multi-vendor, single- and multi-cloud Production platforms, without re-engineering existing cloud-native applications.

In future posts, we will revisit this Storefront API example, further demonstrating how to enable HTTPS (Securing Your Istio Ingress Gateway with HTTPS) and end-user authentication (Istio End-User Authentication for Kubernetes using JSON Web Tokens (JWT) and Auth0)

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.

, , , , , , , , , , , ,

4 Comments

Integrating Search Capabilities with Actions for Google Assistant, using GKE and Elasticsearch: Part 2

Voice and text-based conversational interfaces, such as chatbots, have recently seen tremendous growth in popularity. Much of this growth can be attributed to leading Cloud providers, such as Google, Amazon, and Microsoft, who now provide affordable, end-to-end development, machine learning-based training, and hosting platforms for conversational interfaces.

Cloud-based machine learning services greatly improve a conversational interface’s ability to interpret user intent with greater accuracy. However, the ability to return relevant responses to user inquiries, also requires interfaces have access to rich informational datastores, and the ability to quickly and efficiently query and analyze that data.

In this two-part post, we will enhance the capabilities of a voice and text-based conversational interface by integrating it with a search and analytics engine. By interfacing an Action for Google Assistant conversational interface with Elasticsearch, we will improve the Action’s ability to provide relevant results to the end-user. Instead of querying a traditional database for static responses to user intent, our Action will access a  Near Real-time (NRT) Elasticsearch index of searchable documents. The Action will leverage Elasticsearch’s advanced search and analytics capabilities to optimize and shape user responses, based on their intent.

Action Preview

Here is a brief YouTube video preview of the final Action for Google Assistant, integrated with Elasticsearch, running on an Apple iPhone.

Architecture

If you recall from part one of this post, the high-level architecture of our search engine-enhanced Action for Google Assistant resembles the following. Most of the components are running on Google Cloud.

Google Search Assistant Diagram GCP

Source Code

All open-sourced code for this post can be found on GitHub in two repositories, one for the Spring Boot Service and one for the Action for Google Assistant. Code samples in this post are displayed as GitHub Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

Development Process

In part two of this post, we will tie everything together by creating and integrating our Action for Google Assistant:

  • Create the new Actions for Google Assistant project using the Actions on Google console;
  • Develop the Action’s Intents and Entities using the Dialogflow console;
  • Develop, deploy, and test the Cloud Function to GCP;

Let’s explore each step in more detail.

New ‘Actions on Google’ Project

With Elasticsearch running and the Spring Boot Service deployed to our GKE cluster, we can start building our Actions for Google Assistant. Using the Actions on Google web console, we first create a new Actions project.

wp-search-021

The Directory Information tab is where we define metadata about the project. This information determines how it will look in the Actions directory and is required to publish your project. The Actions directory is where users discover published Actions on the web and mobile devices.

wp-search-019

The Directory Information tab also includes sample invocations, which may be used to invoke our Actions.

wp-search-020

Actions and Intents

Our project will contain a series of related Actions. According to Google, an Action is ‘an interaction you build for the Assistant that supports a specific intent and has a corresponding fulfillment that processes the intent.’ To build our Actions, we first want to create our Intents. To do so, we will want to switch from the Actions on Google console to the Dialogflow console. Actions on Google provides a link for switching to Dialogflow in the Actions tab.

wp-search-022

We will build our Action’s Intents in Dialogflow. The term Intent, used by Dialogflow, is standard terminology across other voice-assistant platforms, such as Amazon’s Alexa and Microsoft’s Azure Bot Service and LUIS. In Dialogflow, will be building Intents — the Find Multiple Posts Intent, Find Post Intent, Find By ID Intent, and so forth.

wp-search-023

Below, we see the Find Post Intent. The Find Post Intent is responsible for handling our user’s requests for a single post about a topic, for example, ‘Find a post about Docker.’ The Intent shown below contains a fair number, but indeed not an exhaustive list, of training phrases. These represent possible ways a user might express intent when invoking the Action.

wp-search-026

Below, we see the Find Multiple Posts Intent. The Find Multiple Posts Intent is responsible for handling our user’s requests for a list of posts about a topic, for example, ‘I’m interested in Docker.’ Similar to the Find Post Intent above, the Find Multiple Posts Intent contains a list of training phrases.

wp-search-025

Dialog Model Training

According to Google, the greater the number of natural language examples in the Training Phrases section of Intents, the better the classification accuracy. Every time a user interacts with our Action, the user’s utterances are logged. Using the Training tab in the Dialogflow console, we can train our model by reviewing and approving or correcting how the Action handled the user’s utterances.

Below we see the user’s utterances, part of an interaction with the Action. We have the option to review and approve the Intent that was called to handle the utterance, re-assign it, or delete it. This helps improve our accuracy of our dialog model.

wp-search-039.png

Dialogflow Entities

Each of the highlighted words in the training phrases maps to the facts parameter, which maps to a collection of @topic Entities. Entities represent a list of intents the Action is trained to understand.  According to Google, there are three types of entities: ‘system’ (defined by Dialogflow), ‘developer’ (defined by a developer), and ‘user’ (built for each individual end-user in every request) objects. We will be creating ‘developer’ type entities for our Action’s Intents.

wp-search-037.png

Automated Expansion

We do not have to define all possible topics a user might search for, as an entity.  By enabling the Allow Automated Expansion option, an Agent will recognize values that have not been explicitly listed in the entity list. Google describes Agents as NLU (Natural Language Understanding) modules.

wp-search-042.png

Entity Synonyms

An entity may contain synonyms. Multiple synonyms are mapped to a single reference value. The reference value is the value passed to the Cloud Function by the Action. For example, take the reference value of ‘GCP.’ The user might ask Google about ‘GCP’. However, the user might also substitute the words ‘Google Cloud’ or ‘Google Cloud Platform.’ Using synonyms, if the user utters any of these three synonymous words or phrase in their intent, the reference value, ‘GCP’, is passed in the request.

But, what if the post contains the phrase, ‘Google Cloud Platform’ more frequently than, or instead of, ‘GCP’? If the acronym, ‘GCP’, is defined as the entity reference value, then it is the value passed to the function, even if you ask for ‘Google Cloud Platform’. In the use case of searching blog posts by topic, entity synonyms are not an effective search strategy.

Elasticsearch Synonyms

A better way to solve for synonyms is by using the synonyms feature of Elasticsearch. Take, for example, the topic of ‘Istio’, Istio is also considered a Service Mesh. If I ask for posts about ‘Service Mesh’, I would like to get back posts that contain the phrase ‘Service Mesh’, but also the word ‘Istio’. To accomplish this, you would define an association between ‘Istio’ and ‘Service Mesh’, as part of the Elasticsearch WordPress posts index.

wp-search-041d

Searches for ‘Istio’ against that index would return results that contain ‘Istio’ and/or contain ‘Service Mesh’; the reverse is also true. Having created and applied a custom synonyms filter to the index, we see how Elasticsearch responds to an analysis of the natural language style phrase, ‘What is a Service Mesh?’. As shown by the tokens output in Kibana’s Dev Tools Console, Elasticsearch understands that ‘service mesh’ is synonymous with ‘istio’.

wp-search-041g

If we query the same five fields as our Action, for the topic of ‘service mesh’, we get four hits for posts (indexed documents) that contain ‘service mesh’ and/or ‘istio’.

wp-search-041c

Actions on Google Integration

Another configuration item in Dialogflow that needs to be completed is the Dialogflow’s Actions on Google integration. This will integrate our Action with Google Assistant. Google currently provides more than fifteen different integrations, including Google Assistant, Slack, Facebook Messanger, Twitter, and Twilio, as shown below.

wp-search-028

To configure the Google Assistant integration, choose the Welcome Intent as our Action’s Explicit Invocation intent. Then we designate our other Intents as Implicit Invocation intents. According to Google, this Google Assistant Integration allows our Action to reach users on every device where the Google Assistant is available.

wp-search-029

Action Fulfillment

When a user’s intent is received, it is fulfilled by the Action. In the Dialogflow Fulfillment console, we see the Action has two fulfillment options, a Webhook or an inline-editable Cloud Function, edited inline. A Webhook allows us to pass information from a matched intent into a web service and get a result back from the service. Our Action’s Webhook will call our Cloud Function on GCP, using the Cloud Function’s URL endpoint (we’ll get this URL in the next section).

wp-search-030

Google Cloud Functions

Our Cloud Function, called by our Action, is written in Node.js. Our function, index.js, is divided into four sections, which are: constants and environment variables, intent handlers, helper functions, and the function’s entry point. The helper functions are part of the Helper module, contained in the helper.js file.

Constants and Environment Variables

The section, in both index.js and helper.js, defines the global constants and environment variables used within the function. Values that reference environment variables, such as SEARCH_API_HOSTNAME are defined in the .env.yaml file. All environment variables in the .env.yaml file will be set during the Cloud Function’s deployment, described later in this post. Environment variables were recently released, and are still considered beta functionality (gist).

The npm module dependencies declared in this section are defined in the dependencies section of the package.json file. Function dependencies include Actions on Google, Firebase Functions, Winston, and Request (gist).

Intent Handlers

The intent handlers in this section correspond to the intents in the Dialogflow console. Each handler responds with a SimpleResponse, BasicCard, and Suggestion Chip response types, or  Simple Response, List, and Suggestion Chip response types. These response types were covered in part one of this post. (gist).

The Welcome Intent handler handles explicit invocations of our Action. The Fallback Intent handler handles both help requests, as well as cases when Dialogflow is unable to handle the user’s request.

As described above in the Dialogflow section, the Find Post Intent handler is responsible for handling our user’s requests for a single post about a topic. For example, ‘Find a post about Docker’. To fulfill the user request, the Find Post Intent handler, calls the Helper module’s getPostByTopic function, passing the topic requested and specifying a result set size of one post with the highest relevance score higher than an arbitrary value of  1.0.

Similarly, the Find Multiple Posts Intent handler is responsible for handling our user’s requests for a list of posts about a topic; for example, ‘I’m interested in Docker’. To fulfill the user request, the Find Multiple Posts Intent handler, calls the Helper module’s getPostsByTopic function, passing the topic requested and specifying a result set size of a maximum of six posts with the highest relevance scores greater than 1.0

The Find By ID Intent handler is responsible for handling our user’s requests for a specific, unique posts ID; for example, ‘Post ID 22141’. To fulfill the user request, the Find By ID Intent handler, calls the Helper module’s getPostById function, passing the unique Post ID (gist).

Entry Point

The entry point creates a way to handle the communication with Dialogflow’s fulfillment API (gist).

Helper Functions

The helper functions are part of the Helper module, contained in the helper.js file. In addition to typical utility functions like formatting dates, there are two functions, which interface with Elasticsearch, via our Spring Boot API, getPostsByTopic and getPostById. As described above, the intent handlers call one of these functions to obtain search results from Elasticsearch.

The getPostsByTopic function handles both the Find Post Intent handler and Find Multiple Posts Intent handler, described above. The only difference in the two calls is the size of the response set, either one result or six results maximum (gist).

Both functions use the request and request-promise-native npm modules to call the Spring Boot service’s RESTful API over HTTP. However, instead of returning a callback, the request-promise-native module allows us to return a native ES6 Promise. By returning a promise, we can use async/await with our Intent handlers. Using async/await with Promises is a newer way of handling asynchronous operations in Node.js. The asynchronous programming model, using promises, is described in greater detail in my previous post, Building Serverless Actions for Google Assistant with Google Cloud Functions, Cloud Datastore, and Cloud Storage.

ThegetPostById function handles both the Find By ID Intent handler and Option Intent handler, described above. This function is similar to the getPostsByTopic function, calling a Spring Boot service’s RESTful API endpoint and passing the Post ID (gist).

Cloud Function Deployment

To deploy the Cloud Function to GCP, use the gcloud CLI with the beta version of the functions deploy command. According to Google, gcloud is a part of the Google Cloud SDK. You must download and install the SDK on your system and initialize it before you can use gcloud. Currently, Cloud Functions are only available in four regions. I have included a shell scriptdeploy-cloud-function.sh, to make this step easier. It is called using the npm run deploy function. (gist).

The creation or update of the Cloud Function can take up to two minutes. Note the output indicates the environment variables, contained in the .env.yaml file, have been deployed. The URL endpoint of the function and the function’s entry point are also both output.

wp-search-031.png

If you recall, the URL endpoint of the Cloud Function is required in the Dialogflow Fulfillment tab. The URL can be retrieved from the deployment output (shown above). The Cloud Function is now deployed and will be called by the Action when a user invokes the Action.

What is Deployed

The .gcloudignore file is created the first time you deploy a new function. Using the the .gcloudignore file, you limit the files deployed to GCP. For this post, of all the files in the project, only four files, index.js, helper.js, package.js, and the PNG file used in the Action’s responses, need to be deployed. All other project files are ear-marked in the .gcloudignore file to avoid being deployed.

wp-search-038.png

Simulation Testing and Debugging

With our Action and all its dependencies deployed and configured, we can test the Action using the Simulation console on Actions on Google. According to Google, the Action Simulation console allows us to manually test our Action by simulating a variety of Google-enabled hardware devices and their settings.

Below, in the Simulation console, we see the successful display of our Programmatic Ponderings Search Action for Google Assistant containing the expected Simple Response, List, and Suggestion Chips response types, triggered by a user’s invocation of the Action.

wp-search-035

The simulated response indicates that the Google Cloud Function was called, and it responded successfully. That also indicates the Dialogflow-based Action successfully communicated with the Cloud Function, the Cloud Function successfully communicated with the Spring Boot service instances running on Google Kubernetes Engine, and finally, the Spring Boot services successfully communicated with Elasticsearch running on Google Compute Engine.

If we had issues with the testing, the Action Simulation console also contains tabs containing the request and response objects sent to and from the Cloud Function, the audio response, a debug console, any errors, and access to the logs.

Stackdriver Logging

In the log output below, from our Cloud Function, we see our Cloud Function’s activities. These activities including information log entries, which we explicitly defined in our Cloud Function using the winston and @google-cloud/logging-winston npm modules. According to Google, the author of the module, Stackdriver Logging for Winston provides an easy to use, higher-level layer (transport) for working with Stackdriver Logging, compatible with Winston. Developing an effective logging strategy is essential to maintaining and troubleshooting your code in Development, as well as Production.

wp-search-036

Conclusion

In this two-part post, we observed how the capabilities of a voice and text-based conversational interface, such as an Action for Google Assistant, may be enhanced through integration with a search and analytics engine, such as Elasticsearch. This post barely scraped the surface of what could be achieved with such an integration. Elasticsearch, as well as other leading Lucene-based search and analytics engines, such as Apache Solr, have tremendous capabilities, which are easily integrated to machine learning-based conversational interfaces, resulting in a more powerful and a more intuitive end-user experience.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers, their clients, or Google.

, , , , , , , , , , , , ,

3 Comments

Integrating Search Capabilities with Actions for Google Assistant, using GKE and Elasticsearch: Part 1

Voice and text-based conversational interfaces, such as chatbots, have recently seen tremendous growth in popularity. Much of this growth can be attributed to leading Cloud providers, such as Google, Amazon, and Microsoft, who now provide affordable, end-to-end development, machine learning-based training, and hosting platforms for conversational interfaces.

Cloud-based machine learning services greatly improve a conversational interface’s ability to interpret user intent with greater accuracy. However, the ability to return relevant responses to user inquiries, also requires interfaces have access to rich informational datastores, and the ability to quickly and efficiently query and analyze that data.

In this two-part post, we will enhance the capabilities of a voice and text-based conversational interface by integrating it with a search and analytics engine. By interfacing an Action for Google Assistant conversational interface with Elasticsearch, we will improve the Action’s ability to provide relevant results to the end-user. Instead of querying a traditional database for static responses to user intent, our Action will access a  Near Realtime (NRT) Elasticsearch index of searchable documents. The Action will leverage Elasticsearch’s advanced search and analytics capabilities to optimize and shape user responses, based on their intent.

Action Preview

Here is a brief YouTube video preview of the final Action for Google Assistant, integrated with Elasticsearch, running on an Apple iPhone.

Google Technologies

The high-level architecture of our search engine-enhanced Action for Google Assistant will look as follows.

Google Search Assistant Diagram GCP

Here is a brief overview of the key technologies we will incorporate into our architecture.

Actions on Google

According to Google, Actions on Google is the platform for developers to extend the Google Assistant. Actions on Google is a web-based platform that provides a streamlined user-experience to create, manage, and deploy Actions. We will use the Actions on Google platform to develop our Action in this post.

Dialogflow

According to Google, Dialogflow is an enterprise-grade NLU platform that makes it easy for developers to design and integrate conversational user interfaces into mobile apps, web applications, devices, and bots. Dialogflow is powered by Google’s machine learning for Natural Language Processing (NLP).

Google Cloud Functions

Google Cloud Functions are part of Google’s event-driven, serverless compute platform, part of the Google Cloud Platform (GCP). Google Cloud Functions are analogous to Amazon’s AWS Lambda and Azure Functions. Features include automatic scaling, high availability, fault tolerance, no servers to provision, manage, patch or update, and a payment model based on the function’s execution time.

Google Kubernetes Engine

Kubernetes Engine is a managed, production-ready environment, available on GCP, for deploying containerized applications. According to Google, Kubernetes Engine is a reliable, efficient, and secure way to run Kubernetes clusters in the Cloud.

Elasticsearch

Elasticsearch is a leading, distributed, RESTful search and analytics engine. Elasticsearch is a product of Elastic, the company behind the Elastic Stack, which includes Elasticsearch, Kibana, Beats, Logstash, X-Pack, and Elastic Cloud. Elasticsearch provides a distributed, multitenant-capable, full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is similar to Apache Solr in terms of features and functionality. Both Solr and Elasticsearch is based on Apache Lucene.

Other Technologies

In addition to the major technologies highlighted above, the project also relies on the following:

  • Google Container Registry – As an alternative to Docker Hub, we will store the Spring Boot API service’s Docker Image in Google Container Registry, making deployment to GKE a breeze.
  • Google Cloud Deployment Manager – Google Cloud Deployment Manager allows users to specify all the resources needed for application in a declarative format using YAML. The Elastic Stack will be deployed with Deployment Manager.
  • Google Compute Engine – Google Compute Engine delivers scalable, high-performance virtual machines (VMs) running in Google’s data centers, on their worldwide fiber network.
  • Google Stackdriver – Stackdriver aggregates metrics, logs, and events from our Cloud-based project infrastructure, for troubleshooting.  We are also integrating Stackdriver Logging for Winston into our Cloud Function for fast application feedback.
  • Google Cloud DNS – Hosts the primary project domain and subdomains for the search engine and API. Google Cloud DNS is a scalable, reliable and managed authoritative Domain Name System (DNS) service running on the same infrastructure as Google.
  • Google VPC Network FirewallFirewall rules provide fine-grain, secure access controls to our API and search engine. We will several firewall port openings to talk to the Elastic Stack.
  • Spring Boot – Pivotal’s Spring Boot project makes it easy to create stand-alone, production-grade Spring-based Java applications, such as our Spring Boot service.
  • Spring Data Elasticsearch – Pivotal Software’s Spring Data Elasticsearch project provides easy integration to Elasticsearch from our Java-based Spring Boot service.

Demonstration

To demonstrate an Action for Google Assistant with search engine integration, we need an index of content to search. In this post, we will build an informational Action, the Programmatic Ponderings Search Action, that responds to a user’s interests in certain technical topics, by returning post suggestions from the Programmatic Ponderings blog. For this demonstration, I have indexed the last two years worth of blog posts into Elasticsearch, using the ElasticPress WordPress plugin.

Source Code

All open-sourced code for this post can be found on GitHub in two repositories, one for the Spring Boot Service and one for the Action for Google Assistant. Code samples in this post are displayed as GitHub Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

Development Process

This post will focus on the development and integration of the Action for Google Assistant with Elasticsearch, via a Google Cloud Function, Kubernetes Engine, and the Spring Boot API service. The post is not intended to be a general how-to on developing for Actions for Google Assistant, Google Cloud Platform, Elasticsearch, or WordPress.

Building and integrating the Action will involve the following steps:

  • Design the Action’s conversation model;
  • Provision the Elastic Stack on Google Compute Engine using Deployment Manager;
  • Create an Elasticsearch index of blog posts;
  • Provision the Kubernetes cluster on GCP with GKE;
  • Develop and deploy the Spring Boot API service to Kubernetes;

Covered in Part Two of the Post:

  • Create a new Actions project using the Actions on Google;
  • Develop the Action’s Intents using the Dialogflow;
  • Develop, deploy, and test the Cloud Function to GCP;

Let’s explore each step in more detail.

Conversational Model

The conversational model design of the Programmatic Ponderings Search Action for Google Assistant will have the option to invoke the Action in two ways, with or without intent. Below on the left, we see an example of an invocation of the Action – ‘Talk to Programmatic Ponderings’. Google Assistant then responds to the user for more information (intent) – ‘What topic are you interested in reading about?’.

sample-dialog-1.png

Below on the left, we see an invocation of the Action, which includes the intent – ‘Ask Programmatic Ponderings to find a post about Kubernetes’. Google Assistant will respond directly, both verbally and visually with the most relevant post.

sample-dialog-2

When a user requests a single result, for example, ‘Find a post about Docker’, Google Assistant will include Simple ResponseBasic Card, and Suggestion Chip response types for devices with a display. This is shown in the center, above. The user may continue to ask for additional facts or choose to cancel the Action at any time.

When a user requests multiple results, for example, ‘I’m interested in Docker’, Google Assistant will include Simple ResponseList, and Suggestion Chip response types for devices with a display. An example of a List Response is shown in the center of the previous set of screengrabs, above. The user will receive up to six results in the list, with a relevance score of 1.0 or greater. The user may choose to click on any of the post results in the list, which will initiate a new search using the post’s unique ID, as shown on the right, in the first set of screengrabs, above.

The conversational model also understands a request for help and to cancel the interaction.

GCP Account and Project

The following steps assume you have an existing GCP account and you have created a project on GCP to house the Cloud Function, GKE Cluster, and Elastic Stack on Google Compute Engine. The post also assumes that you have the latest Google Cloud SDK installed on your development machine, and have authenticated your identity from the command line (gist).

Elasticsearch on GCP

There are a number of options available to host Elasticsearch. Elastic, the company behind Elasticsearch, offers the Elasticsearch Service, a fully managed, scalable, and reliable service on AWS and GCP. AWS also offers their own managed Elasticsearch Service. I found some limitations with AWS’ Elasticsearch Service, which made integration with Spring Data Elasticsearch difficult. According to AWS, the service supports HTTP but does not support TCP transport.

For this post, we will stand up the Elastic Stack on GCP using an offering from the Google Cloud Platform Marketplace. A well-known provider of packaged applications for multiple Cloud platforms, Bitnami, offers the ELK Stack (the previous name for the Elastic Stack), running on Google Compute Engine.

wp-search-004.png

GCP Marketplace Solutions are deployed using the Google Cloud Deployment Manager.  The Bitnami ELK solution is a complete stack with all the necessary software and software-defined Cloud infrastructure to securely run Elasticsearch. You select the instance’s zone(s), machine type, boot disk size, and security and networking configurations. Using that configuration, the Deployment Manager will deploy the solution and provide you with information and credentials for accessing the Elastic Stack. For this demo, we will configure a minimally-sized, single VM instance to run the Elastic Stack.

wp-search-005.png

Below we see the Bitnami ELK stack’s components being created on GCP, by the Deployment Manager.

wp-search-006.png

Indexed Content

With the Elastic Stack fully provisioned, I then configured WordPress to index the last two years of the Programmatic Pondering blog posts to Elasticsearch on GCP. If you want to follow along with this post and content to index, there is plenty of open source and public domain indexable content available on the Internet – books, movie lists, government and weather data, online catalogs of products, and so forth. Anything in a document database is directly indexable in Elasticsearch. Elastic even provides a set of index samples, available on their GitHub site.

wp-search-009

Firewall Ports for Elasticseach

The Deployment Manager opens up firewall ports 80 and 443. To index the WordPress posts, I also had to open port 9200. According to Elastic, Elasticsearch uses port 9200 for communicating with their RESTful API with JSON over HTTP. For security, I locked down this firewall opening to my WordPress server’s address as the source. (gist).

The two existing firewall rules for port opening 80 and 443 should also be locked down to your own IP address as the source. Common Elasticsearch ports are constantly scanned by Hackers, who will quickly hijack your Elasticsearch contents and hold them for ransom, in addition to deleting your indexes. Similar tactics are used on well-known and unprotected ports for many platforms, including Redis, MySQL, PostgreSQL, MongoDB, and Microsoft SQL Server.

Kibana

Once the posts are indexed, the best way to view the resulting Elasticsearch documents is through Kibana, which is included as part of the Bitnami solution. Below we see approximately thirty posts, spread out across two years.

wp-search-010.png

Each Elasticsearch document, representing an indexed WordPress blog post, contains over 125 fields of information. Fields include a unique post ID, post title, content, publish date, excerpt, author, URL, and so forth. All these fields are exposed through Elasticsearch’s API, and as we will see,  will be available to our Spring Boot service to query.

wp-search-011.png

Spring Boot Service

To ensure decoupling between the Action for Google Assistant and Elasticsearch, we will expose a RESTful search API, written in Java using Spring Boot and Spring Data Elasticsearch. The API will expose a tailored set of flexible endpoints to the Action. Google’s machine learning services will ensure our conversational model is trained to understand user intent. The API’s query algorithm and Elasticsearch’s rich Lucene-based search features will ensure the most relevant results are returned. We will host the Spring Boot service on Google Kubernetes Engine (GKE).

Will use a Spring Rest Controller to expose our RESTful web service’s resources to our Action’s Cloud Function. The current Spring Boot service contains five /elastic resource endpoints exposed by the ElasticsearchPostController class . Of those five, two endpoints will be called by our Action in this demo, the /{id} and the /dismax-search endpoints. The endpoints can be seen using the Swagger UI. Our Spring Boot service implements SpringFox, which has the option to expose the Swagger interactive API UI.

wp-search-017.png

The /{id} endpoint accepts a unique post ID as a path variable in the API call and returns a single ElasticsearchPost object wrapped in a Map object, and serialized to a  JSON payload (gist).

Below we see an example response from the Spring Boot service to an API call to the /{id} endpoint, for post ID 22141. Since we are returning a single post, based on ID, the relevance score will always be 0.0 (gist).

This controller’s /{id} endpoint relies on a method exposed by the ElasticsearchPostRepository interface. The ElasticsearchPostRepository is a Spring Data Repository , which extends ElasticsearchRepository. The repository exposes the findById() method, which returns a single instance of the type, ElasticsearchPost, from Elasticsearch (gist).

The ElasticsearchPost class is annotated as an Elasticsearch Document, similar to other Spring Data Document annotations, such as Spring Data MongoDB. The ElasticsearchPost class is instantiated to hold deserialized JSON documents stored in ElasticSeach stores indexed data (gist).

Dis Max Query

The second API endpoint called by our Action is the /dismax-search endpoint. We use this endpoint to search for a particular post topic, such as ’Docker’. This type of search, as opposed to the Spring Data Repository method used by the /{id} endpoint, requires the use of an ElasticsearchTemplate. The ElasticsearchTemplate allows us to form more complex Elasticsearch queries than is possible using an ElasticsearchRepository class. Below, the /dismax-search endpoint accepts four input request parameters in the API call, which are the topic to search for, the starting point and size of the response to return, and the minimum relevance score (gist).

The logic to create and execute the ElasticsearchTemplate is handled by the ElasticsearchService class. The ElasticsearchPostController calls the ElasticsearchService. The ElasticsearchService handles querying Elasticsearch and returning a list of ElasticsearchPost objects to the ElasticsearchPostController. The dismaxSearch method, called by the /dismax-search endpoint’s method constructs the ElasticsearchTemplate instance, used to build the request to Elasticsearch’s RESTful API (gist).

To obtain the most relevant search results, we will use Elasticsearch’s Dis Max Query combined with the Match Phrase Query. Elastic describes the Dis Max Query as:

‘a query that generates the union of documents produced by its subqueries, and that scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries.

In short, the Dis Max Query allows us to query and weight (boost importance) multiple indexed fields, across all documents. The Match Phrase Query analyzes the text (our topic) and creates a phrase query out of the analyzed text.

After some experimentation, I found the valid search results were returned by applying greater weighting (boost) to the post’s title and excerpt, followed by the post’s tags and categories, and finally, the actual text of the post. I also limited results to a minimum score of 1.0. Just because a word or phrase is repeated in a post, doesn’t mean it is indicative of the post’s subject matter. Setting a minimum score attempts to help ensure the requested topic is featured more prominently in the resulting post or posts. Increasing the minimum score will decrease the number of search results, but theoretically, increase their relevance (gist).

Below we see the results of a /dismax-search API call to our service, querying for posts about the topic, ’Istio’, with a minimum score of 2.0. The search resulted in a serialized JSON payload containing three ElasticsearchPost objects (gist).

Understanding Relevance Scoring

When returning search results, such as in the example above, the top result is the one with the highest score. The highest score should denote the most relevant result to the search query. According to Elastic, in their document titled, The Theory Behind Relevance Scoring, scoring is explained this way:

‘Lucene (and thus Elasticsearch) uses the Boolean model to find matching documents, and a formula called the practical scoring function to calculate relevance. This formula borrows concepts from term frequency/inverse document frequency and the vector space model but adds more-modern features like a coordination factor, field length normalization, and term or query clause boosting.’

In order to better understand this technical explanation of relevance scoring, it is much easy to see it applied to our example. Note the first search result above, Post ID 21867, has the highest score, 5.91989. Knowing that we are searching five fields (title, excerpt, tags, categories, and content), and boosting certain fields more than others, how was this score determined? Conveniently, Spring Data Elasticsearch’s SearchRequestBuilder class exposed the setExplain method. We can see this on line 12 of the dimaxQuery method, shown above. By passing a boolean value of true to the setExplain method, we are able to see the detailed scoring algorithms used by Elasticsearch for the top result, shown above (gist).

What this detail shows us is that of the five fields searched, the term ‘Istio’ was located in four of the five fields (all except ‘categories’). Using the practical scoring function described by Elasticsearch, and taking into account our boost values, we see that the post’s ‘excerpt’ field achieved the highest score of 5.9198895 (score of 1.6739764 * boost of 3.0).

Being able to view the scoring explanation helps us tune our search results. For example, according to the details, the term ‘Istio’ appeared 100 times (termFreq=100.0) in the main body of the post (the ‘content’ field). We might ask ourselves if we are giving enough relevance to the content as opposed to other fields. We might choose to increase the boost or decrease other fields with respect to the ‘content’ field, to produce higher quality search results.

Google Kubernetes Engine

With the Elastic Stack running on Google Compute Engine, and the Spring Boot API service built, we can now provision a Kubernetes cluster to run our Spring Boot service. The service will sit between our Action’s Cloud Function and Elasticsearch. We will use Google Kubernetes Engine (GKE) to manage our Kubernete cluster on GCP. A GKE cluster is a managed group of uniform VM instances for running Kubernetes. The VMs are managed by Google Compute Engine. Google Compute Engine delivers virtual machines running in Google’s data centers, on their worldwide fiber network.

A GKE cluster can be provisioned using GCP’s Cloud Console or using the Cloud SDK, Google’s command-line interface for Google Cloud Platform products and services. I prefer using the CLI, which helps enable DevOps automation through tools like Jenkins and Travis CI (gist).

Below is the command I used to provision a minimally sized three-node GKE cluster, replete with the latest available version of Kubernetes. Although a one-node cluster is sufficient for early-stage development, testing should be done on a multi-node cluster to ensure the service will operate properly with multiple instances running behind a load-balancer (gist).

Below, we see the three n1-standard-1 instance type worker nodes, one in each of three different specific geographical locations, referred to as zones. The three zones are in the us-east1 region. Multiple instances spread across multiple zones provide single-region high-availability for our Spring Boot service. With GKE, the Master Node is fully managed by Google.

wp-search-015

Building Service Image

In order to deploy our Spring Boot service, we must first build a Docker Image and make that image available to our Kubernetes cluster. For lowest latency, I’ve chosen to build and publish the image to Google Container Registry, in addition to Docker Hub. The Spring Boot service’s Docker image is built on the latest Debian-based OpenJDK 10 Slim base image, available on Docker Hub. The Spring Boot JAR file is copied into the image (gist).

To automate the build and publish processes with tools such as Jenkins or Travis CI, we will use a simple shell script. The script builds the Spring Boot service using Gradle, then builds the Docker Image containing the Spring Boot JAR file, tags and publishes the Docker image to the image repository, and finally, redeploys the Spring Boot service container to GKE using kubectl (gist).

Below we see the latest version of our Spring Boot Docker image published to the Google Cloud Registry.

wp-search-016

Deploying the Service

To deploy the Spring Boot service’s container to GKE, we will use a Kubernetes Deployment Controller. The Deployment Controller manages the Pods and ReplicaSets. As a deployment alternative, you could choose to use CoreOS’ Operator Framework to create an Operator or use Helm to create a Helm Chart. Along with the Deployment Controller, there is a ConfigMap and a Horizontal Pod Autoscaler. The ConfigMap contains environment variables that will be available to the Spring Boot service instances running in the Kubernetes Pods. Variables include the host and port of the Elasticsearch cluster on GCP and the name of the Elasticsearch index created by WordPress. These values will override any configuration values set in the service’s application.yml Java properties file.

The Deployment Controller creates a ReplicaSet with three Pods, running the Spring Boot service, one on each worker node (gist).

To properly load-balance the three Spring Boot service Pods, we will also deploy a Kubernetes Service of the Kubernetes ServiceType, LoadBalancer. According to Kubernetes, a Kubernetes Service is an abstraction which defines a logical set of Pods and a policy by which to access them (gist).

Below, we see three instances of the Spring Boot service deployed to the GKE cluster on GCP. Each Pod, containing an instance of the Spring Boot service, is in a load-balanced pool, behind our service load balancer, and exposed on port 80.

wp-search-014

Testing the API

We can test our API and ensure it is talking to Elasticsearch, and returning expected results using the Swagger UI, shown previously, or tools like Postman, shown below.

wp-search-018.png

Communication Between GKE and Elasticsearch

Similar to port 9200, which needed to be opened for indexing content over HTTP, we also need to open firewall port 9300 between the Spring Boot service on GKE and Elasticsearch. According to Elastic, Elasticsearch Java clients talk to the Elasticsearch cluster over port 9300, using the native Elasticsearch transport protocol (TCP).

Google Search Assistant Diagram WordPress Index

Again, locking this port down to the GKE cluster as the source is critical for security (gist).

Part Two

In part one we have examined the creation of the Elastic Stack, the provisioning of the GKE cluster, and the development and deployment of the Spring Boot service to Kubernetes. In part two of this post, we will tie everything together by creating and integrating our Action for Google Assistant:

  • Create the new Actions project using the Actions on Google console;
  • Develop the Action’s Intents using the Dialogflow console;
  • Develop, deploy, and test the Cloud Function to GCP;

Google Search Assistant Diagram part 2b.png

Related Posts

If you’re interested in comparing the development of an Action for Google Assistant with that of Amazon’s Alexa and Microsoft’s LUIS-enabled chatbots, in addition to this post, I would recommend the previous three posts in this conversation interface series:

All three article’s demonstrations leverage their respective Cloud platform’s machine learning-based Natural language understanding (NLU) services. All three take advantage of their respective Cloud platform’s NoSQL database and object storage services. Lastly, all three of the article’s demonstrations are written in a common language, Node.js.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers, their clients, or Google.

, , , , , , , , , , , , ,

3 Comments

Deploying Spring Boot Apps to AWS with Netflix Nebula and Spinnaker: Part 2 of 2

In Part One of this post, we examined enterprise deployment tools and introduced two of Netflix’s open-source deployment tools, the Nebula Gradle plugins, and Spinnaker. In Part Two, we will deploy a production-ready Spring Boot application, the Election microservice, to multiple Amazon EC2 instances, behind an Elastic Load Balancer (ELB). We will use a fully automated DevOps workflow. The build, test, package, bake, deploy process will be handled by the Netflix Nebula Gradle Linux Packaging Plugin, Jenkins, and Spinnaker. The high-level process will involve the following steps:

  • Configure Gradle to build a production-ready fully executable application for Unix systems (executable JAR)
  • Using deb-s3 and GPG Suite, create a secure, signed APT (Debian) repository on Amazon S3
  • Using Jenkins and the Netflix Nebula plugin, build a Debian package, containing the executable JAR and configuration files
  • Using Jenkins and deb-s3, publish the package to the S3-based APT repository
  • Using Spinnaker (HashiCorp Packer under the covers), bake an Ubuntu Amazon Machine Image (AMI), replete with the executable JAR installed from the Debian package
  • Deploy an auto-scaling set of Amazon EC2 instances from the baked AMI, behind an ELB, running the Spring Boot application using both the Red/Black and Highlander deployment strategies
  • Be able to repeat the entire automated build, test, package, bake, deploy process, triggered by a new code push to GitHub

The overall build, test, package, bake, deploy process will look as follows.

DebianPackageWorkflow12.png

DevOps Architecture

Spinnaker’s modern architecture is comprised of several independent microservices. The codebase is written in Java and Groovy. It leverages the Spring Boot framework¹. Spinnaker’s configuration, startup, updates, and rollbacks are centrally managed by Halyard. Halyard provides a single point of contact for command line interaction with Spinnaker’s microservices.

Spinnaker can be installed on most private or public infrastructure, either containerized or virtualized. Spinnaker has links to a number of Quickstart installations on their website. For this demonstration, I deployed and configured Spinnaker on Azure, starting with one of the Azure Spinnaker quick-start ARM templates. The template provisions all the necessary Azure resources. For better performance, I chose upgraded the default VM to a larger Standard D4 v3, which contains 4 vCPUs and 16 GB of memory. I would recommend at least 2 vCPUs and 8 GB of memory at a minimum for Spinnaker.

Another Azure VM, in the same virtual network as the Spinnaker VM, already hosts Jenkins, SonarQube, and Nexus Repository OSS.

From Spinnaker on Azure, Debian Packages are uploaded to the APT package repository on AWS S3. Spinnaker also bakes Amazon Machine Images (AMI) on AWS. Spinnaker provisions the AWS resources, including EC2 instances, Load Balancers, Auto Scaling Groups, Launch Configurations, and Security Groups. The only resources you need on AWS to get started with Spinnaker are a VPC and Subnets. There are some minor, yet critical prerequisites for naming your VPC and Subnets.

Other external tools include GitHub for source control and Slack for notifications. I have built and managed everything from a Mac, however, all tools are platform agnostic. The Spring Boot application was developed in JetBrains IntelliJ.

Spinnaker Architecture 2.png

Source Code

All source code for this post can be found on GitHub. The project’s README file contains a list of the Election service’s endpoints.

Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

APT Repository

After setting up Spinnaker on Azure, I created an APT repository on Amazon S3, using the instructions provided by Netflix, in their Code Lab, An Introduction to Spinnaker: Hello Deployment. The setup involves creating an Amazon S3 bucket to serve as an APT (Debian) repository, creating a GPG key for signing, and using deb-s3 to manage the repository. The Code Lab also uses Aptly, a great tool, which I skipped for brevity.

spin19

GPG Key

On the Mac, I used GPG Suite to create a GPG (GNU Privacy Guard or GnuPG) automatic signing key for my APT repository. The key is required by Spinnaker to verify the Debian packages in the repository, before installation.

The Ruby Gem, deb-s3, makes management of the Debian packages easy and automatable with Jenkins. Jenkins uploads the Debian packages, using a deb-s3 command, such as the following (gist). In this post, Jenkins calls the command from the shell script, upload-deb-package.sh, which is included in the GitHub project.

The Jenkins user requires access to the signing key, to build and upload the Debian packages. I created my GPG key on my Mac, securely copied the key to my Ubuntu-based Jenkins VM, and then imported the key for the Jenkins user. You could also create your key on Ubuntu, directly. Make sure you backup your private key in a secure location!

Nebula Packaging Plugin

Next, I set up a Gradle task in my build.gradle file to build my Debian packages using the Netflix Nebula Gradle Linux Packaging Plugin. Although Debian packaging tasks could become complex for larger application installations, this task for this post is pretty simple. I used many of the best-practices suggested by Spring for Production-grade deployments. The best-practices guide recommends file location, file modes, and file user and group ownership. I create the JAR as a fully executable JAR, meaning it is started like any other executable and does not have to be started with the standard java -jar command.

In the task, shown below (gist), the JAR and the external configuration file (optional) are copied to specific locations during the deployment and symlinked, as required. I used the older SysVInit system (init.d) to enable the application to automatically starts on boot. You should probably use systemctl for your services with Ubuntu 16.04.

You can use the ar (archive) command (i.e., ar -x spring-postgresql-demo_4.5.0_all.deb), to extract and inspect the structure of a Debian package. The data.tar.gz file, displayed below in Atom, shows the final package structure.

spin47.png

Base AMI

Next, I baked a base AMI for Spinnaker to use. This base AMI is used by Spinnaker to bake (re-bake) the final AMI(s) used for provisioning the EC2 instances, containing the Spring Boot Application. The Spinnaker base AMI is built from another base AMI, the official Ubuntu 16.04 LTS image. I installed the OpenJDK 8 package on the AMI, which is required to run the Java-based Election service. Lastly and critically, I added information about the location of my S3-based APT Debian package repository to the list of configured APT data sources, and the GPG key required for package verification. This information and key will be used later by Spinnaker to bake AMIʼs, using this base AMI. The set-up script, base_ubuntu_ami_setup.sh, which is included in the GitHub project.

Jenkins

This post uses a single Jenkins CI/CD pipeline. Using a Webhook, the pipeline is automatically triggered by every git push to the GitHub project. The pipeline pulls the source code, builds the application, and performs unit-tests and static code analysis with SonarQube. If the build succeeds and the tests pass, the build artifact (JAR file) is bundled into a Debian package using the Nebula Packaging plugin, uploaded to the S3 APT repository using s3-deb, and archived locally for Spinnaker to reference. Once the pipeline is completed, on success or on failure, a Slack notification is sent. The Jenkinsfile, used for this post is available in the project on Github.

Below is a traditional Jenkins view of the CI/CD pipeline, with links to unit test reports, SonarQube results, build artifacts, and GitHub source code.

spin01

Below is the same pipeline viewed using the Jenkins Blue Ocean plugin.

spin02

It is important to perform sufficient testing before building the Debian package. You donʼt want to bake an AMI and deploy EC2 instances, at a cost, before finding out the application has bugs.

spin03

Spinnaker Setup

First, I set up a new Spinnaker Slack channel and a custom bot user. Spinnaker details the Slack set up in their Notifications and Events Guide. You can configure what type of Spinnaker events trigger Slack notifications.

spin46.png

AWS Spinnaker User

Next, I added the required Spinnaker User, Policy, and Roles to AWS. Spinnaker uses this access to query and provision infrastructure on your behalf. The Spinnaker User requires Power User level access to perform all their necessary tasks. AWS IAM set up is detailed by Spinnaker in their Cloud Providers Setup for AWS. They also describe the setup of other cloud providers. You need to be reasonably familiar with AWS IAM, including the PassRole permission to set up this part. As part of the setup, you enable AWS for Spinnaker and add your AWS account using the Halyard interface.

spin45

Spinnaker Security Groups

Next, I set up two Spinnaker Security Groups, corresponding to two AWS Security Groups, one for the load balancer and one for the Election service. The load balancer security group exposes port 80, and the Election service security group exposes port 8080.

spin36

Spinnaker Load Balancer

Next, I created a Spinnaker Load Balancer, corresponding to an Amazon Classic Load Balancer. The Load Balancer will load-balance the Election service EC2 instances. Below you see a Load Balancer, balancing a pair of active EC2 instances, the result of a Red/Black deployment.

spin37

Spinnaker can currently create both AWS Classic Load Balancers as well as Application Load Balancers (ALB).

spin25

Spinnaker Pipeline

This post uses a single, basic Spinnaker Pipeline. The pipeline bakes a new AMI from the Debian package generated by the Jenkins pipeline. After a manual approval stage, Spinnaker deploys a set of EC2 instances, behind the Load Balancer, which contains the latest version of the Election service. Spinnaker finishes the pipeline by sending a Slack notification.

spin26

Jenkins Integration

The pipeline is triggered by the successful completion of the Jenkins pipeline. This is set in the Configuration stage of the pipeline. The integration with Jenkins is managed through Spinnaker’s Igor service.

spin22.png

Bake Stage

Next, in the Bake stage, Spinnaker bakes a new AMI, containing the Debian package generated by the Jenkins pipeline. The stageʼs configuration contains the package name to reference.

spin29

The stageʼs configuration also includes a reference to which Base AMI to use, to bake the new AMIs. Here I have used the AMI ID of the base Spinnaker AMI, I created previously.

spin27

Deploy Stage

Next, the Deploy stage deploys the Election service, running on EC2 instances, provisioned from the new AMI, which was baked in the last stage. To configure the Deploy stage, you define a Spinnaker Server Group. According to Spinnaker, the Server Group identifies the deployable artifact, VM image type, the number of instances, autoscaling policies, metadata, Load Balancer, and a Security Group.

spin32

The Server Group also defines the Deployment Strategy. Below, I chose the Red/Black Deployment Strategy (also referred to as Blue/Green). This strategy will disable, not terminate the active Server Group. If the new deployment fails, we can manually or automatically perform a Rollback to the previous, currently disabled Server Group.

spin11

Letʼs Start Baking!

With set up complete, letʼs kick off a git push, trigger and complete the Jenkins pipeline, and finally trigger the Spinnaker pipeline. Below we see the pipelineʼs Bake stage has been started. Spinnakerʼs UI lets us view the Bakery Details. The Bakery, provided by Spinnakerʼs Rosco service, bakes the AMIs. Rosco uses HashiCorp Packer to bake the AMIs, using standard Packer templates.

spin04

Below we see Spinnaker (Rosco/Packer) locating the Base Spinnaker AMI we configured in the Pipelineʼs Bake stage. Next, we see Spinnaker sshʼing into a new EC2 instance with a temporary keypair and Security Group and starting the Election service Debian package installation.

spin23

Continuing, we see the latest Debian package, derived from the Jenkins pipelineʼs archive, being pulled from the S3-based APT repo. The package is verified using the GPG key and then installed. Lastly, we see a new AMI is created, containing the deployed Election service, which was initially built and packaged by Jenkins. Note the AWS Resource Tags created by Spinnaker, as shown in the Bakery output.

spin24

The base Spinnaker AMI and the AMIs baked by Spinnaker are visible in the AWS Console. Note the naming conventions used by Spinnaker for the AMIs, the Source AMI used to build the new APIs, and the addition of the Tags, which we saw being applied in the Bakery output above. The use of Tags indirectly allows full traceability from the deployed EC2 instance all the way back to the original code commit to git by the Developer.

spin48.png

Red/Black Deployments

With the new AMI baked successfully, and a required manual approval, using a Manual Judgement type pipeline stage, we can now begin a Red/Black deployment to AWS.

spin07

Using the Server Group configuration in the Deploy stage, Spinnaker deploys two EC2 instances, behind the ELB.

spin08

Below, we see the successful results of the Red/Black deployment. The single Spinnaker Cluster contains two deployed Server Groups. One group, the previously active Server Group (RED), comprised of two EC2 instances, is disabled. The ‘RED’ EC2 instances are unregistered with the load balancer but still running. The new Server Group (BLACK), also comprised of two EC2 instances, is now active and registered with the Load Balancer. Spinnaker will spread EC2 instances evenly across all Availability Zones in the US East (N. Virginia) Region.

spin38

From the AWS Console, we can observe four running instances, though only two are registered with the load-balancer.

spin34

Here we see each deployed Server Group has a different Auto Scaling Group and Launch Configuration. Note the continued use of naming conventions by Spinnaker.

spin33

 There can be only one, Highlander!

Now, in the Deploy stage of the pipeline, we will switch the Server Groupʼs Strategy to Highlander. The Highlander strategy will, as you probably guessed by the name, destroy all other Server Groups in the Cluster. This is more typically used for lower environments, like Development or Test, where you are only interested in the next version of the application for testing. The Red/Black strategy is more applicable to Production, where you want the opportunity to quickly rollback to the previous deployment, if necessary.

spin12

Following a successful deployment, below, we now see the first two Server Groups have been terminated, and a third Server Group in the Cluster is active.

spin40.png

In the AWS Console, we can confirm the four previous EC2 instances have been successfully terminated as a result of the Highlander deployment strategy, and two new instances are running.

spin39

As well, the previous Auto Scaling Groups and Launch Configurations have been deleted from AWS by Spinnaker.

spin44.png

As expected, the Classic Load Balancer only contains the two most recent EC2 instances from the last Server Group deployed.

spin41

Confirming the Deployment

Using the DNS address of the load balancer, we can hit the Election service endpoints, on either of the EC2 instances. All API endpoints are listed in the Projectʼs README file. Below, from a web browser, we see the candidates resource returning candidate information, retrieved from the Electionʼs PostgreSQL RDS database Test instance.

spin42

Similarly, from Postman, we can hit the load balancer and get back election information from the elections resource, using an HTTP GET.

spin43.png

I intentionally left out a discussion of the service’s RDS database and how configuration management was handled with Spring Profiles and Spring Cloud Config. Both topics were out of scope for this post.

Conclusion

Although this was a brief, whirlwind overview of deployment tools, it shows the power of delivery tools like Spinnaker, when seamlessly combined with other tools, like Jenkins and the Nebula plugins. Together, these tools are capable of efficiently, repeatably, and securely deploying large numbers of containerized and non-containerized applications to a variety of private, public, and hybrid cloud infrastructure.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.

¹ Running Spinnaker on Compute Engine

, , , , , , , , , , , ,

1 Comment

Deploying Spring Boot Apps to AWS with Netflix Nebula and Spinnaker: Part 1 of 2

Listening to DevOps industry pundits, you might be convinced everyone is running containers in Production (or by now, serverless). Although containerization is growing at a phenomenal rate, several recent surveys¹ indicate less than 50% of enterprises are deploying containers in Production. Filter those results further with the fact, of those enterprises, only a small percentage of their total application portfolios are containerized, let alone in Production.

As a DevOps Consultant, I regularly work with corporations whose global portfolios are in the thousands of applications. Indeed, some percentage of their applications are containerized, with less running in Production. However, a majority of those applications, even those built on modern, light-weight, distributed architectures, are still being deployed to bare-metal and virtualized public cloud and private data center infrastructure, for a variety of reasons.

Enterprise Deployment

Due to the scale and complexity of application portfolios, many organizations have invested in enterprise deployment tools, either commercially available or developed in-house. The enterprise deployment tool’s primary objective is to standardize the process of securely, reliably, and repeatably packaging, publishing, and deploying both containerized and non-containerized applications to large fleets of virtual machines and bare-metal servers, across multiple, geographically dispersed data centers and cloud providers. Enterprise deployment tools are particularly common in tightly regulated and compliance-driven organizations, as well as organizations that have undertaken large amounts of M&A, resulting in vastly different application technology stacks.

Enterprise CI/CD/Release Workflow

Better-known examples of commercially available enterprise deployment tools include IBM UrbanCode Deploy (aka uDeploy), XebiaLabs XL Deploy, CA Automic Release Automation, Octopus Deploy, and Electric Cloud ElectricFlow. While commercial tools continue to gain market share³, many organizations are tightly coupled to their in-house solutions through years of use and fear of widespread process disruption, given current economic, security, compliance, and skills-gap sensitivities.

Deployment Tool Anatomy

Most Enterprise deployment tools are compatible with standard binary package types, including Debian (.deb) and Red Hat  (RPM) Package Manager (.rpm) packages for Linux, NuGet (.nupkg) packages for Windows, and Node Package Manager (.npm) and Bower for JavaScript. There are equivalent package types for other popular languages and formats, such as Go, Python, Ruby, SQL, Android, Objective-C, Swift, and Docker. Packages usually contain application metadata, a signature to ensure the integrity and/or authenticity², and a compressed payload.

Enterprise deployment tools are normally integrated with open-source packaging and publishing tools, such as Apache Maven, Apache Ivy/Ant, Gradle, NPMNuGet, BundlerPIP, and Docker.

Binary packages (and images), built with enterprise deployment tools, are typically stored in private, open-source or commercial binary (artifact) repositories, such as SpacewalkJFrog Artifactory, and Sonatype Nexus Repository. The latter two, Artifactory and Nexus, support a multitude of modern package types and repository structures, including Maven, NuGet, PyPI, NPM, Bower, Ruby Gems, CocoaPods, Puppet, Chef, and Docker.

Mature binary repositories provide many features in addition to package management, including role-based access control, vulnerability scanning, rich APIs, DevOps integration, and fault-tolerant, high-availability architectures.

Lastly, enterprise deployment tools generally rely on standard package management systems to retrieve and install cryptographically verifiable packages and images. These include YUM (Yellowdog Updater, Modified), APT (aptitude), APK (Alpine Linux), NuGet, Chocolatey, NPM, PIP, Bundler, and Docker. Packages are deployed directly to running infrastructure, or indirectly to intermediate deployable components as Amazon Machine Images (AMI), Google Compute Engine machine images, VMware machines, Docker Images, or CoreOS rkt.

Open-Source Alternative

One such enterprise with an extensive portfolio of both containerized and non-containerized applications is Netflix. To standardize their deployments to multiple types of cloud infrastructure, Netflix has developed several well-known open-source software (OSS) tools, including the Nebula Gradle plugins and Spinnaker. I discussed Spinnaker in my previous post, Managing Applications Across Multiple Kubernetes Environments with Istio, as an alternative to Jenkins for deploying container workloads to Kubernetes on Google (GKE).

As a leader in OSS, Netflix has documented their deployment process in several articles and presentations, including a post from 2016, ‘How We Build Code at Netflix.’ According to the article, the high-level process for deployment to Amazon EC2 instances involves the following steps:

  • Code is built and tested locally using Nebula
  • Changes are committed to a central git repository
  • Jenkins job executes Nebula, which builds, tests, and packages the application for deployment
  • Builds are “baked” into Amazon Machine Images (using Spinnaker)
  • Spinnaker pipelines are used to deploy and promote the code change

The Nebula plugins and Spinnaker leverage many underlying, open-source technologies, including Pivotal Spring, Java, Groovy, Gradle, Maven, Apache Commons, Redline RPM, HashiCorp Packer, Redis, HashiCorp Consul, Cassandra, and Apache Thrift.

Both the Nebula plugins and Spinnaker have been battle tested in Production by Netflix, as well as by many other industry leaders after Netflix open-sourced the tools in 2014 (Nebula) and 2015 (Spinnaker). Currently, there are approximately 20 Nebula Gradle plugins available on GitHub. Notable core-contributors in the development of Spinnaker include Google, Microsoft, Pivotal, Target, Veritas, and Oracle, to name a few. A sign of its success, Spinnaker currently has over 4,600 Stars on GitHub!

Part Two: Demonstration

In Part Two, we will deploy a production-ready Spring Boot application, the Election microservice, to multiple Amazon EC2 instances, behind an Elastic Load Balancer (ELB). We will use a fully automated DevOps workflow. The build, test, package, bake, deploy process will be handled by the Netflix Nebula Gradle Linux Packaging Plugin, Jenkins, and Spinnaker. The high-level process will involve the following steps:

  • Configure Gradle to build a production-ready fully executable application for Unix systems (executable JAR)
  • Using deb-s3 and GPG Suite, create a secure, signed APT (Debian) repository on Amazon S3
  • Using Jenkins and the Netflix Nebula plugin, build a Debian package, containing the executable JAR and configuration files
  • Using Jenkins and deb-s3, publish the package to the S3-based APT repository
  • Using Spinnaker (HashiCorp Packer under the covers), bake an Ubuntu Amazon Machine Image (AMI), replete with the executable JAR installed from the Debian package
  • Deploy an auto-scaling set of Amazon EC2 instances from the baked AMI, behind an ELB, running the Spring Boot application using both the Red/Black and Highlander deployment strategies
  • Be able to repeat the entire automated build, test, package, bake, deploy process, triggered by a new code push to GitHub

The overall build, test, package, bake, deploy process will look as follows.

DebianPackageWorkflow12

References

 

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.

¹ Recent Surveys: ForresterPortworx,  Cloud Foundry Survey
² Courtesy Wikipedia – rpm
³ XebiaLabs Kicks Off 2017 with Triple-Digit Growth in Enterprise DevOps

, , , , , , , , , , , ,

1 Comment

Developing Cloud-Native Data-Centric Spring Boot Applications for Pivotal Cloud Foundry

In this post, we will explore the development of a cloud-native, data-centric Spring Boot 2.0 application, and its deployment to Pivotal Software’s hosted Pivotal Cloud Foundry service, Pivotal Web Services. We will add a few additional features, such as Spring Data, Lombok, and Swagger, to enhance our application.

According to Pivotal, Spring Boot makes it easy to create stand-alone, production-grade Spring-based Applications. Spring Boot takes an opinionated view of the Spring platform and third-party libraries. Spring Boot 2.0 just went GA on March 1, 2018. This is the first major revision of Spring Boot since 1.0 was released almost 4 years ago. It is also the first GA version of Spring Boot that provides support for Spring Framework 5.0.

Pivotal Web Services’ tagline is ‘The Agile Platform for the Agile Team Powered by Cloud Foundry’. According to Pivotal,  Pivotal Web Services (PWS) is a hosted environment of Pivotal Cloud Foundry (PCF). PWS is hosted on AWS in the US-East region. PWS utilizes two availability zones for redundancy. PWS provides developers a Spring-centric PaaS alternative to AWS Elastic Beanstalk, Elastic Container Service (Amazon ECS), and OpsWorks. With PWS, you get the reliability and security of AWS, combined with the rich-functionality and ease-of-use of PCF.

To demonstrate the feature-rich capabilities of the Spring ecosystem, the Spring Boot application shown in this post incorporates the following complimentary technologies:

  • Spring Boot Actuator: Sub-project of Spring Boot, adds several production grade services to Spring Boot applications with little developer effort
  • Spring Data JPA: Sub-project of Spring Data, easily implement JPA based repositories and data access layers
  • Spring Data REST: Sub-project of Spring Data, easily build hypermedia-driven REST web services on top of Spring Data repositories
  • Spring HATEOAS: Create REST representations that follow the HATEOAS principle from Spring-based applications
  • Springfox Swagger 2: We are using the Springfox implementation of the Swagger 2 specification, an automated JSON API documentation for API’s built with Spring
  • Lombok: The @Data annotation generates boilerplate code that is typically associated with simple POJOs (Plain Old Java Objects) and beans: @ToString, @EqualsAndHashCode, @Getter, @Setter, and @RequiredArgsConstructor

Source Code

All source code for this post can be found on GitHub. To get started quickly, use one of the two following commands (gist).

For this post, I have used JetBrains IntelliJ IDEA and Git Bash on Windows for development. However, all code should be compatible with most popular IDEs and development platforms. The project assumes you have Docker and the Cloud Foundry Command Line Interface (cf CLI) installed locally.

Code samples in this post are displayed as Gists, which may not display correctly on some mobile and social media browsers. Links to gists are also provided.

Demo Application

The Spring Boot application demonstrated in this post is a simple election-themed RESTful API. The app allows API consumers to create, read, update, and delete, candidates, elections, and votes, via its exposed RESTful HTTP-based resources.

The Spring Boot application consists of (7) JPA Entities that mirror the tables and views in the database, (7) corresponding Spring Data Repositories, (2) Spring REST Controller, (4) Liquibase change sets, and associated Spring, Liquibase, Swagger, and PCF configuration files. I have intentionally chosen to avoid the complexities of using Data Transfer Objects (DTOs) for brevity, albeit a security concern, and directly expose the entities as resources.

img022_Final_Project

Controller Resources

This application is a simple CRUD application. The application contains a few simple HTTP GET resources in each of the two controller classes, as an introduction to Spring REST controllers. For example, the CandidateController contains the /candidates/summary and /candidates/summary/{election} resources (shown below in Postman). Typically, you would expose your data to the end-user as controller resources, as opposed to exposing the entities directly. The ease of defining controller resources is one of the many powers of Spring Boot.

img025_CustomResource.PNG

Paging and Sorting

As an introduction to Spring Data’s paging and sorting features, both the VoteRepository and VotesElectionsViewRepository Repository Interfaces extend Spring Data’s PagingAndSortingRepository<T,ID> interface, instead of the default CrudRepository<T,ID> interface. With paging and sorting enabled, you may both sort and limit the amount of data returned in the response payload. For example, to reduce the size of your response payload, you might choose to page through the votes in blocks of 25 votes at a time. In that case, as shown below in Postman, if you needed to return just votes 26-50, you would append the /votes resource with ?page=1&size=25. Since paging starts on page 0 (zero), votes 26-50 will on page 1.

img024_Paging

Swagger

This project also includes the Springfox implementation of the Swagger 2 specification. Along with the Swagger 2 dependency, the project takes a dependency on io.springfox:springfox-swagger-ui. The Springfox Swagger UI dependency allows us to view and interactively test our controller resources through Swagger’s browser-based UI, as shown below.

img027B_Swagger

All Swagger configuration can be found in the project’s SwaggerConfig Spring Configuration class.

Gradle

This post’s Spring Boot application is built with Gradle, although it could easily be converted to Maven if desired. According to Gradle, Gradle is the modern tool used to build, automate and deliver everything from mobile apps to microservices.

Data

In real life, most applications interact with one or more data sources. The Spring Boot application demonstrated in this post interacts with data from a PostgreSQL database. PostgreSQL, or simply Postgres, is the powerful, open-source object-relational database system, which has supported production-grade applications for 15 years. The application’s single elections database consists of (6) tables, (3) views, and (2) function, which are both used to generate random votes for this demonstration.

img020_Database_Diagram

Spring Data makes interacting with PostgreSQL easy. In addition to the features of Spring Data, we will use Liquibase. Liquibase is known as the source control for your database. With Liquibase, your database development lifecycle can mirror your Spring development lifecycle. Both DDL (Data Definition Language) and DML (Data Manipulation Language) changes are versioned controlled, alongside the Spring Boot application source code.

Locally, we will take advantage of Docker to host our development PostgreSQL database, using the official PostgreSQL Docker image. With Docker, there is no messy database installation and configuration of your local development environment. Recreating and deleting your PostgreSQL database is simple.

To support the data-tier in our hosted PWS environment, we will implement ElephantSQL, an offering from the Pivotal Services Marketplace. ElephantSQL is a hosted version of PostgreSQL, running on AWS. ElephantSQL is referred to as PostgreSQL as a Service, or more generally, a Database as a Service or DBaaS. As a Pivotal Marketplace service, we will get easy integration with our PWS-hosted Spring Boot application, with near-zero configuration.

Docker

First, set up your local PostgreSQL database using Docker and the official PostgreSQL Docker image. Since this is only a local database instance, we will not worry about securing our database credentials (gist).

Your running PostgreSQL container should resemble the output shown below.

img001_docker

Data Source

Most IDEs allow you to create and save data sources. Although this is not a requirement, it makes it easier to view the database’s resources and table data. Below, I have created a data source in IntelliJ from the running PostgreSQL container instance. The port, username, password, and database name were all taken from the above Docker command.

img002_IntelliJ_Data_Source

Liquibase

There are multiple strategies when it comes to managing changes to your database. With Liquibase, each set of changes are handled as change sets. Liquibase describes a change set as an atomic change that you want to apply to your database. Liquibase offers multiple formats for change set files, including XML, JSON, YAML, and SQL. For this post, I have chosen SQL, specifically PostgreSQL SQL dialect, which can be designated in the IntelliJ IDE. Below is an example of the first changeset, which creates four tables and two indexes.

img023_Change_Set

As shown below, change sets are stored in the db/changelog/changes sub-directory, as configured in the master change log file (db.changelog-master.yaml). Change set files follow an incremental naming convention.

img003C_IntelliJ_Liquibase_Changesets

The empty PostgreSQL database, before any Liquibase changes, should resemble the screengrab shown below.

img003_IntelliJ_Blank_Database_cropped

To automatically run Liquibase database migrations on startup, the org.liquibase:liquibase-core dependency must be added to the project’s build.gradle file. To apply the change sets to your local, empty PostgreSQL database, simply start the service locally with the gradle bootRun command. As the app starts after being compiled, any new Liquibase change sets will be applied.

img004_Gradle_bootRun

You might ask how does Liquibase know the change sets are new. During the initial startup of the Spring Boot application, in addition to any initial change sets, Liquibase creates two database tables to track changes, the databasechangelog and databasechangeloglock tables. Shown below are the two tables, along with the results of the four change sets included in the project, and applied by Liquibase to the local PostgreSQL elections database.

img005_IntelliJ_Initial_Database_cropped

Below we see the contents of the databasechangelog table, indicating that all four change sets were successfully applied to the database. Liquibase checks this table before applying change sets.

img006B_IntelliJ_Database_Change_Log

ElephantSQL

Before we can deploy our Spring Boot application to PWS, we need an accessible PostgreSQL instance in the Cloud; I have chosen ElephantSQL. Available through the Pivotal Services Marketplace, ElephantSQL currently offers one free and three paid service plans for their PostgreSQL as a Service. I purchased the Panda service plan as opposed to the free Turtle service plan. I found the free service plan was too limited in the maximum number of database connections for multiple service instances.

Previewing and purchasing an ElephantSQL service plan from the Pivotal Services Marketplace, assuming you have an existing PWS account, literally takes a single command (gist).

The output of the command should resemble the screengrab below. Note the total concurrent connections and total storage for each plan.

img007_PCF_ElephantSQL_Service_Purchase

To get details about the running ElephantSQL service, use the cf service elections command.

img007_PCF_ElephantSQL_Service_Info

From the ElephantSQL Console, we can obtain the connection information required to access our PostgreSQL elections database. You will need the default database name, username, password, and URL.

img012_PWS_ElephantSQL_Details

Service Binding

Once you have created the PostgreSQL database service, you need to bind the database service to the application. We will bind our application and the database, using the PCF deployment manifest file (manifest.yml), found in the project’s root directory. Binding is done using the services section (shown below).

The key/value pairs in the env section of the deployment manifest will become environment variables, local to the deployed Spring Boot service. These key/value pairs in the manifest will also override any configuration set in Spring’s external application properties file (application.yml). This file is located in the resources sub-directory. Note the SPRING_PROFILES_ACTIVE: test environment variable in the manifest.yml file. This variable designates which Spring Profile will be active from the multiple profiles defined in the application.yml file.

img008B_PCF_Manifest

Deployment to PWS

Next, we run gradle build followed by cf push to deploy one instance of the Spring Boot service to PWS and associate it with our ElephantSQL database instance. Below is the expected output from the cf push command.

img008_PCF_CF_Push

Note the route highlighted below. This is the URL where your Spring Boot service will be available.

img009_PCF_CF_Push2

To confirm your ElephantSQL database was populated by Liquibase when PWS started the deployed Spring application instance, we can check the ElephantSQL Console’s Stats tab. Note the database tables and rows in each table, signifying Liquibase ran successfully. Alternately, you could create another data source in your IDE, connected to ElephantSQL; this can be helpful for troubleshooting.

img013_Candidates

To access the running service and check that data is being returned, point your browser (or Postman) to the URL returned from the cf push command output (see second screengrab above) and hit the /candidates resource. Obviously, your URL, referred to as a route by PWS, will be different and unique. In the response payload, you should observe a JSON array of eight candidate objects. Each candidate was inserted into the Candidate table of the database, by Liquibase, when Liquibase executed the second of the four change sets on start-up.

img012_PWS_ElephantSQL

With Spring Boot Actuator and Spring Data REST, our simple Spring Boot application has dozens of resources exposed automatically, without extensive coding of resource controllers. Actuator exposes resources to help manage and troubleshoot the application, such as info, health, mappings (shown below), metrics, env, and configprops, among others. All Actuator resources are exposed explicitly, thus they can be disabled for Production deployments. With Spring Boot 2.0, all Actuator resources are now preceded with /actuator/ .

img029_Postman_Mappings

According to Pivotal, Spring Data REST builds on top of Spring Data repositories, analyzes an application’s domain model and exposes hypermedia-driven HTTP resources for aggregates contained in the model, such as our /candidates resource. A partial list of the application’s exposed resources are listed in the GitHub project’s README file.

In Spring’s approach to building RESTful web services, HTTP requests are handled by a controller. Spring Data REST automatically exposes CRUD resources for our entities. With Spring Data JPA, POJOs like our Candidate class are annotated with @Entity, indicating that it is a JPA entity. Lacking a @Table annotation, it is assumed that this entity will be mapped to a table named Candidate.

With Spring’s Data REST’s RESTful HTTP-based API, traditional database Create, Read, Update, and Delete commands for each PostgreSQL database table are automatically mapped to equivalent HTTP methods, including POST, GET, PUT, PATCH, and DELETE.

Below is an example, using Postman, to create a new Candidate using an HTTP POST method.

img029_Postman_Post

Below is an example, using Postman, to update a new Candidate using an HTTP PUT method.

img029_Postman_Put.PNG

With Spring Data REST, we can even retrieve data from read-only database Views, as shown below. This particular JSON response payload was returned from the candidates_by_elections database View, using the /election-candidates resource.

img028_Postman_View.PNG

Scaling Up

Once your application is deployed and you have tested its functionality, you can easily scale out or scale in the number instances, memory, and disk, with the cf scale command (gist).

Below is sample output from scaling up the Spring Boot application to two instances.

img016_Scale_Up2

Optionally, you can activate auto-scaling, which will allow the application to scale based on load.

img016_Autoscaling.PNG

Following the PCF architectural model, auto-scaling is actually another service from the Pivotal Services Marketplace, PCF App Autoscaler, as seen below, running alongside our ElephantSQL service.

img016_Autoscaling2.PNG

With PCF App Autoscaler, you set auto-scaling minimum and maximum instance limits and define scaling rules. Below, I have configured auto-scaling to scale out the number of application instances when the average CPU Utilization of all instances hits 80%. Conversely, the application will scale in when the average CPU Utilization recedes below 40%. In addition to CPU Utilization, PCF App Autoscaler also allows you to set scaling rules based on HTTP Throughput, HTTP Latency, RabbitMQ Depth (queue depth), and Memory Utilization.

Furthermore, I set the auto-scaling minimum number of instances to two and the maximum number of instances to four. No matter how much load is placed on the application, PWS will not scale above four instances. Conversely, PWS will maintain a minimum of two running instances at all times.

img016_Autoscaling3

Conclusion

This brief post demonstrates both the power and simplicity of Spring Boot to quickly develop highly-functional, data-centric RESTful API applications with minimal coding. Further, when coupled with Pivotal Cloud Foundry, Spring developers have a highly scalable, resilient cloud-native application hosting platform.

, , , , , , , , , , , ,

Leave a comment

Eventual Consistency: Decoupling Microservices with Spring AMQP and RabbitMQ

RabbitMQEnventCons.png

Introduction

In a recent post, Decoupling Microservices using Message-based RPC IPC, with Spring, RabbitMQ, and AMPQ, we moved away from synchronous REST HTTP for inter-process communications (IPC) toward message-based IPC. Moving to asynchronous message-based communications allows us to decouple services from one another. It makes it easier to build, test, and release our individual services. In that post, we did not achieve fully asynchronous communications. Although, we did achieve a higher level of service decoupling using message-based Remote Procedure Call (RPC) IPC.

In this post, we will fully decouple our services using the distributed computing model of eventual consistency. More specifically, we will use a message-based, event-driven, loosely-coupled, eventually consistent architectural approach for communications between services.

What is eventual consistency? One of the best definitions of eventual consistency I have read was posted on microservices.io. To paraphrase, ‘using an event-driven, eventually consistent approach, each service publishes an event whenever it updates its data. Other services subscribe to events. When an event is received, a service updates its data.

Example of Eventual Consistency

Imagine, Service A, the Customer service, inserts a new customer record into its database. Based on that ‘customer created’ event, Service A publishes a message containing the new customer object, serialized to JSON, to the lightweight, persistent, New Customer message queue.

Service B, the Customer Onboarding service, a subscriber to the New Customer queue, consumes and deserializes Service A’s message. Service B may or may not perform a data transformation of the Customer object to its own Customer data model. Service B then inserts the new customer record into its own database.

In the above example, it can be said that the customer records in Service B’s database are eventually consistent with the customer records in Service A’s database. Service A makes a change and publishes a message in response to the event. Service B consumes the message and makes the same change. Eventually (likely within milliseconds), Service B’s customer records are consistent with Service A’s customer records.

Why Eventual Consistency?

So what does this apparent added complexity and duplication of data buy us? Consider the advantages. Service B, the Onboarding service, requires no knowledge of, or a dependency on, Service A, the Customer service. Still, Service B has a current record of all the customers that Service A maintains. Instead of making repeated and potentially costly RESTful HTTP calls or RPC message-based calls to or from Service A to Service B for new customers, Service B queries its database for a list of customers.

The value of eventual consistency increases factorially as you scale a distributed system. Imagine dozens of distinct microservices, many requiring data from other microservices. Further, imagine multiple instances of each of those services all running in parallel. Decoupling services from one another, through asynchronous forms of IPC, messaging, and event-driven eventual consistency greatly simplifies the software development lifecycle and operations.

Demonstration

In this post, we could use a few different architectural patterns to demonstrate message passing with RabbitMQ and Spring AMQP. They including Work Queues, Publish/Subscribe, Routing, or Topics. To keep things as simple as possible, we will have a single Producer, publish messages to a single durable and persistent message queue. We will have a single Subscriber, a Consumer, consume the messages from that queue. We focus on a single type of event message.

Sample Code

To demonstrate Spring AMQP-based messaging with RabbitMQ, we will use a reference set of three Spring Boot microservices. The Election ServiceCandidate Service, and Voter Service are all backed by MongoDB. The services and MongoDB, along with RabbitMQ and Voter API Gateway, are all part of the Voter API.

The Voter API Gateway, based on HAProxy, serves as a common entry point to all three services, as well as serving as a reverse proxy and load balancer. The API Gateway provides round-robin load-balanced access to multiple instances of each service.

Voter_API_Architecture

All the source code found this post’s example is available on GitHub, within a few different project repositories. The Voter Service repository contains the Voter service source code, along with the scripts and Docker Compose files required to deploy the project. The Election Service repository, Candidate Service repository, and Voter API Gateway repository are also available on GitHub. There is also a new AngularJS/Node.js Web Client, to demonstrate how to use the Voter API.

For this post, you only need to clone the Voter Service repository.

Deploying Voter API

All components, including the Spring Boot services, MongoDB, RabbitMQ, API Gateway, and the Web Client, are individually deployed using Docker. Each component is publicly available as a Docker Image, on Docker Hub. The Voter Service repository contains scripts to deploy the entire set of Dockerized components, locally. The repository also contains optional scripts to provision a Docker Swarm, using Docker’s newer swarm mode, and deploy the components. We will only deploy the services locally for this post.

To clone and deploy the components locally, including the Spring Boot services, MongoDB, RabbitMQ, and the API Gateway, execute the following commands. If this is your first time running the commands, it may take a few minutes for your system to download all the required Docker Images from Docker Hub.

If everything was deployed successfully, you should observe six running Docker containers, similar to the output, below.

Using Voter API

The Voter Service, Election Service, and Candidate Service GitHub repositories each contain README files, which detail all the API endpoints each service exposes, and how to call them.

In addition to casting votes for candidates, the Voter service can simulate election results. Calling the /simulation endpoint, and indicating the desired election, the Voter service will randomly generate a number of votes for each candidate in that election. This will save us the burden of casting votes for this demonstration. However, the Voter service has no knowledge of elections or candidates. The Voter service depends on the Candidate service to obtain a list of candidates.

The Candidate service manages electoral candidates, their political affiliation, and the election in which they are running. Like the Voter service, the Candidate service also has a /simulation endpoint. The service will create a list of candidates based on the 2012 and 2016 US Presidential Elections. The simulation capability of the service saves us the burden of inputting candidates for this demonstration.

The Election service manages elections, their polling dates, and the type of election (federal, state, or local). Like the other services, the Election service also has a /simulation endpoint, which will create a list of sample elections. The Election service will not be discussed in this post’s demonstration. We will examine communications between the Candidate and Voter services, only.

REST HTTP Endpoint

As you recall from our previous post, Decoupling Microservices using Message-based RPC IPC, with Spring, RabbitMQ, and AMPQ, the Voter service exposes multiple, almost identical endpoints. Each endpoint uses a different means of IPC to retrieve candidates and generates random votes.

Calling the /voter/simulation/http/{election} endpoint and providing a specific election, prompts the Voter service to request a list of candidates from the Candidate service, based on the election parameter you input. This request is done using synchronous REST HTTP. The Voter service uses the HTTP GET method to request the data from the Candidate service. The Voter service then waits for a response.

The Candidate service receives the HTTP request. The Candidate service responds to the Voter service with a list of candidates in JSON format. The Voter service receives the response payload containing the list of candidates. The Voter service then proceeds to generate a random number of votes for each candidate in the list. Finally, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters  database.

Message-based RPC Endpoint

Similarly, calling the /voter/simulation/rpc/{election} endpoint and providing a specific election, prompts the Voter service to request the same list of candidates. However, this time, the Voter service (the client) produces a request message and places in RabbitMQ’s voter.rpc.requests queue. The Voter service then waits for a response. The Voter service has no direct dependency on the Candidate service; it only depends on a response to its request message. In this way, it is still a form of synchronous IPC, but the Voter service is now decoupled from the Candidate service.

The request message is consumed by the Candidate service (the server), who is listening to that queue. In response, the Candidate service produces a message containing the list of candidates serialized to JSON. The Candidate service (the server) sends a response back to the Voter service (the client) through RabbitMQ. This is done using the Direct reply-to feature of RabbitMQ or using a unique response queue, specified in the reply-to header of the request message, sent by the Voter Service.

The Voter service receives the message containing the list of candidates. The Voter service deserializes the JSON payload to candidate objects. The Voter service then proceeds to generate a random number of votes for each candidate in the list. Finally, identical to the previous example, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters database.

New Endpoint

Calling the new /voter/simulation/db/{election} endpoint and providing a specific election, prompts the Voter service to query its own MongoDB database for a list of candidates.

But wait, where did the candidates come from? The Voter service didn’t call the Candidate service? The answer is message-based eventual consistency. Whenever a new candidate is created, using a REST HTTP POST request to the Candidate service’s /candidate/candidates endpoint, a Spring Data Rest Repository Event Handler responds. Responding to the candidate created event, the event handler publishes a message, containing a serialized JSON representation of the new candidate object, to a durable and persistent RabbitMQ queue.

The Voter service is listening to that queue. The Voter service consumes messages off the queue, deserializes the candidate object, and saves it to its own voters database, to the candidate collection. For this example, we are saving the incoming candidate object as is, with no transformations. The candidate object model for both services is identical.

When /voter/simulation/db/{election} endpoint is called, the Voter service queries its voters database for a list of candidates. They Voter service then proceeds to generate a random number of votes for each candidate in the list. Finally, identical to the previous two examples, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters  database.

Message_Queue_Diagram_Final3B

Exploring the Code

We will not review the REST HTTP or RPC IPC code in this post. It was covered in detail, in the previous post. Instead, we will explore the new code required for eventual consistency.

Spring Dependencies

To use AMQP with RabbitMQ, we need to add a project dependency on org.springframework.boot.spring-boot-starter-amqp. Below is a snippet from the Candidate service’s build.gradle file, showing project dependencies. The Voter service’s dependencies are identical.

AMQP Configuration

Next, we need to add a small amount of RabbitMQ AMQP configuration to both services. We accomplish this by using Spring’s @Configuration annotation on our configuration classes. Below is the abridged configuration class for the Voter service.

And here, the abridged configuration class for the Candidate service.

Event Handler

With our dependencies and configuration in place, we will define the CandidateEventHandler class. This class is annotated with the Spring Data Rest @RepositoryEventHandler and Spring’s @Component. The @Component annotation ensures the event handler is registered.

The class contains the handleCandidateSave method, which is annotated with the Spring Data Rest @HandleAfterCreate. The event handler acts on the Candidate object, which is the first parameter in the method signature.

Responding to the candidate created event, the event handler publishes a message, containing a serialized JSON representation of the new candidate object, to the candidates.queue queue. This was the queue we configured earlier.

Consuming Messages

Next, we let’s switch to the Voter service’s CandidateListService class. Below is an abridged version of the class with two new methods. First, the getCandidateMessage method listens to the candidates.queue queue. This was the queue we configured earlier. The method is annotated with theSpring AMQP Rabbit @RabbitListener annotation.

The getCandidateMessage retrieves the new candidate object from the message, deserializes the message’s JSON payload, maps it to the candidate object model and saves it to the Voter service’s database.

The second method, getCandidatesQueueDb, retrieves the candidates from the Voter service’s database. The method makes use of the Spring Data MongoDB Aggregation package to return a list of candidates from MongoDB.

RabbitMQ Management Console

The easiest way to observe what is happening with the messages is using the RabbitMQ Management Console. To access the console, point your web browser to localhost, on port 15672. The default login credentials for the console are guest/guest. As you successfully produce and consume messages with RabbitMQ, you should see activity on the Overview tab.

RabbitMQ_EC_Durable3.png

Recall we said the queue, in this example, was durable. That means messages will survive the RabbitMQ broker stopping and starting. In the below view of the RabbitMQ Management Console, note the six messages persisted in memory. The Candidate service produced the messages in response to six new candidates being created. However, the Voter service was not running, and therefore, could not consume the messages. In addition, the RabbitMQ server was restarted, after receiving the candidate messages. The messages were persisted and still present in the queue after the successful reboot of RabbitMQ.

RabbitMQ_EC_Durable

Once RabbitMQ and the Voter service instance were back online, the Voter service successfully consumed the six waiting messages from the queue.

RabbitMQ_EC_Durable2.png

Service Logs

In addition to using the RabbitMQ Management Console, we may obverse communications between the two services by looking at the Voter and Candidate service’s logs. I have grabbed a snippet of both service’s logs and added a few comments to show where different processes are being executed.

First the Candidate service logs. We observe a REST HTTP POST request containing a new candidate. We then observe the creation of the new candidate object in the Candidate service’s database, followed by the event handler publishing a message on the queue. Finally, we observe the response is returned in reply to the initial REST HTTP POST request.

Now the Voter service logs. At the exact same second as the message and the response sent by the Candidate service, the Voter service consumes the message off the queue. The Voter service then deserializes the new candidate object and inserts it into its database.

MongoDB

Using the mongo Shell, we can observe six new 2016 Presidential Election candidates in the Candidate service’s database.

Now, looking at the Voter service’s database, we should find the same six 2016 Presidential Election candidates. Note the Object IDs are the same between the two service’s document sets, as are the rest of the fields (first name, last name, political party, and election). However, the class field is different between the two service’s records.

Production Considerations

The post demonstrated a simple example of message-based, event-driven eventual consistency. In an actual Production environment, there are a few things that must be considered.

  • We only addressed a ‘candidate created’ event. We would also have to code for other types of events, such as a ‘candidate deleted’ event and a ‘candidate updated’ event.
  • If a candidate is added, deleted, then re-added, are the events published and consumed in the right order? What about with multiple instances of the Voter service running? Does this pattern guarantee event ordering?
  • How should the Candidate service react on startup if RabbitMQ is not available
  • What if RabbitMQ fails after the Candidate services have started?
  • How should the Candidate service react if a new candidate record is added to the database, but a ‘candidate created’ event message cannot be published to RabbitMQ? The two actions are not wrapped in a single transaction.
  • In all of the above scenarios, what response should be returned to the API end user?

Conclusion

In this post, using eventual consistency, we successfully decoupled our two microservices and achieved asynchronous inter-process communications. Adopting a message-based, event-driven, loosely-coupled architecture, wherever possible, in combination with REST HTTP when it makes sense, will improve the overall manageability and scalability of a microservices-based platform.

References

All opinions in this post are my own and not necessarily the views of my current employer or their clients.

, , , , , , , , , , ,

3 Comments