Automating Multi-Environment Kubernetes Virtual Clusters with Google Cloud DNS, Auth0, and Istio 1.0

Kubernetes supports multiple virtual clusters within the same physical cluster. These virtual clusters are called Namespaces. Namespaces are a way to divide cluster resources between multiple users. Many enterprises use Namespaces to divide the same physical Kubernetes cluster into different virtual software development environments as part of their overall Software Development Lifecycle (SDLC). This practice is commonly used in ‘lower environments’ or ‘non-prod’ (not Production) environments. These environments commonly include Continuous Integration and Delivery (CI/CD), Development, Integration, Testing/Quality Assurance (QA), User Acceptance Testing (UAT), Staging, Demo, and Hotfix. Namespaces provide a basic form of what is referred to as soft multi-tenancy.

Generally, the security boundaries and performance requirements between non-prod environments, within the same enterprise, are less restrictive than those of Production or Disaster Recovery (DR) environments. This allows non-prod environments to be multi-tenant, while Production and DR are normally single-tenant environments. In order to approximate the performance characteristics of Production, the Performance Testing environment is also often isolated as a single-tenant environment. A typical enterprise would minimally have a non-prod, performance, production, and DR environment.

Using Namespaces to create virtual separation on the same physical Kubernetes cluster provides enterprises with more efficient use of virtual compute resources, reduces Cloud costs, eases the management burden, and often expedites and simplifies the release process.

Demonstration

In this post, we will re-examine the topic of virtual clusters, similar to the recent post, Managing Applications Across Multiple Kubernetes Environments with Istio: Part 1 and Part 2. We will focus specifically on automating the creation of the virtual clusters on GKE with Istio 1.0, managing the Google Cloud DNS records associated with the cluster’s environments, and enabling both HTTPS and token-based OAuth access to each environment. We will use the Storefront API for our demonstration, featured in the previous three posts, including Building a Microservices Platform with Confluent Cloud, MongoDB Atlas, Istio, and Google Kubernetes Engine.

gke-routing.png

Source Code

The source code for this post may be found on the gke branch of the storefront-kafka-docker GitHub repository.

git clone --branch gke --single-branch --depth 1 --no-tags \
  https://github.com/garystafford/storefront-kafka-docker.git

This project contains all the code to deploy and configure the GKE cluster and Kubernetes resources.

Screen Shot 2019-01-19 at 11.49.31 AM.png

To follow along, you will need to register your own domain, arrange for an Auth0, or alternative, authentication and authorization service, and obtain an SSL/TLS certificate.

SSL/TLS Wildcard Certificate

In the recent post, Securing Your Istio Ingress Gateway with HTTPS, we examined how to create and apply an SSL/TLS certificate to our GKE cluster, to secure communications. Although we are only creating a non-prod cluster, it is more and more common to use SSL/TLS everywhere, especially in the Cloud. For this post, I have registered a single wildcard certificate, *.api.storefront-demo.com. This certificate will cover the three second-level subdomains associated with the virtual clusters: dev.api.storefront-demo.com, test.api.storefront-demo.com, and uat.api.storefront-demo.com. Setting the environment name, such as dev.*, as the second-level subdomain of my storefront-demo domain, following the first level api.* subdomain, makes the use of a wildcard certificate much easier.

screen_shot_2019-01-13_at_10.04.23_pm

As shown below, my wildcard certificate contains the Subject Name and Subject Alternative Name (SAN) of *.api.storefront-demo.com. For Production, api.storefront-demo.com, I prefer to use a separate certificate.

screen_shot_2019-01-13_at_10.36.33_pm_detail
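
The reason a separate certificate is needed for api.storefront-demo.com is that a wildcard stands in for exactly one DNS label. The sketch below illustrates that rule in plain Bash; the matches_wildcard helper is hypothetical and written only for this illustration, it is not part of the project's source code and it does not inspect a real certificate.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if HOST is covered by wildcard PATTERN.
# A wildcard stands in for exactly one DNS label, so the host must be
# "<single-label>.<suffix>", where <suffix> is the pattern minus "*.".
matches_wildcard() {
  local host="$1" pattern="$2"
  local suffix="${pattern#\*.}"
  local label="${host%".${suffix}"}"
  # The host must end with ".<suffix>", and what remains must be a
  # single, non-empty label (no dots).
  [[ "$host" == *".${suffix}" && -n "$label" && "$label" != *.* ]]
}

for host in dev.api.storefront-demo.com test.api.storefront-demo.com \
    uat.api.storefront-demo.com api.storefront-demo.com; do
  if matches_wildcard "$host" '*.api.storefront-demo.com'; then
    echo "covered:     $host"
  else
    echo "not covered: $host"
  fi
done
```

The three non-prod hostnames are reported as covered, while the Production hostname, api.storefront-demo.com, is not.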

Create GKE Cluster

With your certificate in hand, create the non-prod Kubernetes cluster. Below, the script creates a minimally-sized, three-node, multi-zone GKE cluster, running on GCP, with Kubernetes Engine cluster version 1.11.5-gke.5 and Istio on GKE version 1.0.3-gke.0. I have enabled the master authorized networks option to secure my GKE cluster master endpoint. For the demo, you can set the MASTER_AUTH_NETS constant to your own IP address CIDR (e.g., 1.2.3.4/32), or remove the --enable-master-authorized-networks and --master-authorized-networks options (lines 30–31) to remove the restriction (gist).

  • Lines 16–39: Create a 3-node, multi-zone GKE cluster with Istio;
  • Line 48: Create three non-prod Namespaces: dev, test, and uat;
  • Lines 51–53: Enable Istio automatic sidecar injection within each Namespace;


#!/bin/bash
#
# author: Gary A. Stafford
# site: https://programmaticponderings.com
# license: MIT License
# purpose: Create non-prod Kubernetes cluster on GKE

# Constants - CHANGE ME!
readonly PROJECT='gke-confluent-atlas'
readonly CLUSTER='storefront-api-non-prod'
readonly REGION='us-central1'
readonly MASTER_AUTH_NETS='<your_ip_cidr>'
readonly NAMESPACES=( 'dev' 'test' 'uat' )

# Build a 3-node, single-region, multi-zone GKE cluster
time gcloud beta container \
  --project "$PROJECT" clusters create "$CLUSTER" \
  --region "$REGION" \
  --no-enable-basic-auth \
  --no-issue-client-certificate \
  --cluster-version "1.11.5-gke.5" \
  --machine-type "n1-standard-2" \
  --image-type "COS" \
  --disk-type "pd-standard" \
  --disk-size "100" \
  --scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
  --num-nodes "1" \
  --enable-stackdriver-kubernetes \
  --enable-ip-alias \
  --enable-master-authorized-networks \
  --master-authorized-networks "$MASTER_AUTH_NETS" \
  --network "projects/${PROJECT}/global/networks/default" \
  --subnetwork "projects/${PROJECT}/regions/${REGION}/subnetworks/default" \
  --default-max-pods-per-node "110" \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing,Istio \
  --istio-config auth=MTLS_STRICT \
  --metadata disable-legacy-endpoints=true \
  --enable-autoupgrade \
  --enable-autorepair

# Get cluster creds
gcloud container clusters get-credentials "$CLUSTER" \
  --region "$REGION" --project "$PROJECT"

kubectl config current-context

# Create Namespaces
kubectl apply -f ./resources/other/namespaces.yaml

# Enable automatic Istio sidecar injection
for namespace in "${NAMESPACES[@]}"; do
  kubectl label namespace "$namespace" istio-injection=enabled
done

If successful, the results should look similar to the output, below.

screen_shot_2019-01-15_at_11.51.08_pm

The cluster will contain a pool of three minimally-sized VMs, the Kubernetes nodes.

screen_shot_2019-01-16_at_12.06.03_am
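
The cluster script applies ./resources/other/namespaces.yaml, which is not shown in this post. As a rough sketch of what such a manifest could contain, the hypothetical render_namespaces helper below prints a minimal Namespace document for each environment name. The helper and the exact manifest layout are assumptions for illustration; the file in the repository may differ.

```shell
#!/bin/bash
# Hypothetical helper: print a minimal Namespace manifest for each
# environment name passed as an argument.
render_namespaces() {
  local ns
  for ns in "$@"; do
    cat <<EOF
---
apiVersion: v1
kind: Namespace
metadata:
  name: ${ns}
EOF
  done
}

# Print the three non-prod Namespace manifests to stdout.
render_namespaces dev test uat
```

The output could be saved to a file or piped to kubectl apply -f - in place of the static manifest.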

Deploying Resources

The Istio Gateway and three VirtualService resources are the primary resources responsible for routing the traffic from the ingress router to the Services, within the multiple Namespaces. Both of these resource types are part of the v1alpha3 networking API used by Istio 1.0 (gist).

  • Lines 9–16: Port config that only accepts HTTPS traffic on port 443 using TLS;
  • Lines 18–20: The three subdomains being routed to the non-prod GKE cluster;
  • Lines 28, 63, 98: The subdomain (host) that each VirtualService applies to;
  • Lines 39, 47, 55, 74, 82, 90, 109, 117, 125: Routing to FQDN of Storefront API Services within the three Namespaces;


apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: storefront-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      serverCertificate: /etc/istio/ingressgateway-certs/tls.crt
      privateKey: /etc/istio/ingressgateway-certs/tls.key
    hosts:
    - dev.api.storefront-demo.com
    - test.api.storefront-demo.com
    - uat.api.storefront-demo.com
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: storefront-dev
spec:
  hosts:
  - dev.api.storefront-demo.com
  gateways:
  - storefront-gateway
  http:
  - match:
    - uri:
        prefix: /accounts
    route:
    - destination:
        port:
          number: 8080
        host: accounts.dev.svc.cluster.local
  - match:
    - uri:
        prefix: /fulfillment
    route:
    - destination:
        port:
          number: 8080
        host: fulfillment.dev.svc.cluster.local
  - match:
    - uri:
        prefix: /orders
    route:
    - destination:
        port:
          number: 8080
        host: orders.dev.svc.cluster.local
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: storefront-test
spec:
  hosts:
  - test.api.storefront-demo.com
  gateways:
  - storefront-gateway
  http:
  - match:
    - uri:
        prefix: /accounts
    route:
    - destination:
        port:
          number: 8080
        host: accounts.test.svc.cluster.local
  - match:
    - uri:
        prefix: /fulfillment
    route:
    - destination:
        port:
          number: 8080
        host: fulfillment.test.svc.cluster.local
  - match:
    - uri:
        prefix: /orders
    route:
    - destination:
        port:
          number: 8080
        host: orders.test.svc.cluster.local
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: storefront-uat
spec:
  hosts:
  - uat.api.storefront-demo.com
  gateways:
  - storefront-gateway
  http:
  - match:
    - uri:
        prefix: /accounts
    route:
    - destination:
        port:
          number: 8080
        host: accounts.uat.svc.cluster.local
  - match:
    - uri:
        prefix: /fulfillment
    route:
    - destination:
        port:
          number: 8080
        host: fulfillment.uat.svc.cluster.local
  - match:
    - uri:
        prefix: /orders
    route:
    - destination:
        port:
          number: 8080
        host: orders.uat.svc.cluster.local
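
The three VirtualService resources above differ only in the environment name, so in an automated workflow they could be generated rather than maintained by hand. The following is a minimal sketch of that idea; the render_virtual_service function and the SERVICES array are hypothetical names introduced for this example, and the sketch assumes every service listens on port 8080 and is routed by a URI prefix matching its name, as in the manifest above.

```shell
#!/bin/bash
# Services exposed in every environment (taken from the manifest above).
SERVICES=( 'accounts' 'fulfillment' 'orders' )

# Hypothetical generator: print the VirtualService for one environment.
render_virtual_service() {
  local environment="$1" service
  cat <<EOF
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: storefront-${environment}
spec:
  hosts:
  - ${environment}.api.storefront-demo.com
  gateways:
  - storefront-gateway
  http:
EOF
  # One route per service, pointing at the Service's in-cluster DNS name.
  for service in "${SERVICES[@]}"; do
    cat <<EOF
  - match:
    - uri:
        prefix: /${service}
    route:
    - destination:
        port:
          number: 8080
        host: ${service}.${environment}.svc.cluster.local
EOF
  done
}

for environment in dev test uat; do
  render_virtual_service "$environment"
done
```

Adding a fourth environment would then only require adding its name to the loop, plus a matching host entry in the Gateway.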

Next, deploy the Istio and Kubernetes resources to the new GKE cluster. For the sake of brevity, we will deploy the same number of instances and the same version of each of the three Storefront API services (Accounts, Orders, Fulfillment) to each of the three non-prod environments (dev, test, uat). In reality, you would have varying numbers of instances of each service, and each environment would contain progressive versions of each service, as part of the SDLC of each microservice (gist).

  • Lines 13–14: Deploy the SSL/TLS certificate and the private key;
  • Line 17: Deploy the Istio Gateway and three VirtualService resources;
  • Lines 20–22: Deploy the Istio Authentication Policy resources for each Namespace;
  • Lines 26–37: Deploy the same set of resources to the dev, test, and uat Namespaces;


#!/bin/bash
#
# author: Gary A. Stafford
# site: https://programmaticponderings.com
# license: MIT License
# purpose: Deploy Kubernetes/Istio resources

# Constants - CHANGE ME!
readonly CERT_PATH=~/Documents/Articles/gke-kafka/sslforfree_non_prod
readonly NAMESPACES=( 'dev' 'test' 'uat' )

# Kubernetes Secret to hold the server's certificate and private key
kubectl create -n istio-system secret tls istio-ingressgateway-certs \
  --key "$CERT_PATH/private.key" --cert "$CERT_PATH/certificate.crt"

# Istio Gateway and three VirtualService resources
kubectl apply -f ./resources/other/istio-gateway.yaml

# End-user auth applied per environment
kubectl apply -f ./resources/other/auth-policy-dev.yaml
kubectl apply -f ./resources/other/auth-policy-test.yaml
kubectl apply -f ./resources/other/auth-policy-uat.yaml

# Loop through each non-prod Namespace (environment)
# Re-use same resources (incl. credentials) for all environments, just for the demo
for namespace in "${NAMESPACES[@]}"; do
  kubectl apply -n "$namespace" -f ./resources/config/confluent-cloud-kafka-configmap.yaml
  kubectl apply -n "$namespace" -f ./resources/config/mongodb-atlas-secret.yaml
  kubectl apply -n "$namespace" -f ./resources/config/confluent-cloud-kafka-secret.yaml

  kubectl apply -n "$namespace" -f ./resources/other/mongodb-atlas-external-mesh.yaml
  kubectl apply -n "$namespace" -f ./resources/other/confluent-cloud-external-mesh.yaml

  kubectl apply -n "$namespace" -f ./resources/services/accounts.yaml
  kubectl apply -n "$namespace" -f ./resources/services/fulfillment.yaml
  kubectl apply -n "$namespace" -f ./resources/services/orders.yaml
done

The deployed Storefront API Services should look as follows.

screen_shot_2019-01-13_at_7.16.03_pm

Google Cloud DNS

Next, we need to enable DNS access to the GKE cluster using Google Cloud DNS. According to Google, Cloud DNS is a scalable, reliable and managed authoritative Domain Name System (DNS) service running on the same infrastructure as Google. It has low latency, high availability, and is a cost-effective way to make your applications and services available to your users.

Whenever a new GKE cluster is created, a new Network Load Balancer is also created. By default, the load balancer’s front-end is an external IP address.

screen_shot_2019-01-15_at_11.56.01_pm.png

Using a forwarding rule, traffic directed at the external IP address is redirected to the load balancer’s back-end. The load balancer’s back-end consists of three VM instances, which are the three Kubernetes nodes in the GKE cluster.

screen_shot_2019-01-15_at_11.56.19_pm

If you are following along with this post’s demonstration, we will assume you have a domain registered and configured with Google Cloud DNS. I am using the storefront-demo.com domain, which I have used in the last three posts to demonstrate Istio and GKE.

Google Cloud DNS has a fully functional web console, part of the Google Cloud Console. However, using the Cloud DNS web console is impractical in a DevOps CI/CD workflow, where Kubernetes clusters, Namespaces, and Workloads are ephemeral. Therefore, we will use the following script. Within the script, we reset the IP address associated with the A records for each of the non-prod subdomains associated with the storefront-demo.com domain (gist).

  • Lines 23–25: Find the previous load balancer’s front-end IP address;
  • Lines 27–29: Find the new load balancer’s front-end IP address;
  • Line 35: Start the Cloud DNS transaction;
  • Lines 37–47: Add the DNS record changes to the transaction;
  • Line 49: Execute the Cloud DNS transaction;


#!/bin/bash
#
# author: Gary A. Stafford
# site: https://programmaticponderings.com
# license: MIT License
# purpose: Update Cloud DNS A Records

# Constants - CHANGE ME!
readonly PROJECT='gke-confluent-atlas'
readonly DOMAIN='storefront-demo.com'
readonly ZONE='storefront-demo-com-zone'
readonly REGION='us-central1'
readonly TTL=300
readonly RECORDS=('dev' 'test' 'uat')

# Make sure any old load balancers were removed
if [ "$(gcloud compute forwarding-rules list --filter "region:($REGION)" | wc -l | awk '{$1=$1};1')" -gt 2 ]; then
  echo "More than one load balancer detected, exiting script."
  exit 1
fi

# Get load balancer IP address from first record
readonly OLD_IP=$(gcloud dns record-sets list \
  --filter "name=${RECORDS[0]}.api.${DOMAIN}." --zone "$ZONE" \
  | awk 'NR==2 {print $4}')

readonly NEW_IP=$(gcloud compute forwarding-rules list \
  --filter "region:($REGION)" \
  | awk 'NR==2 {print $3}')

echo "Old LB IP Address: ${OLD_IP}"
echo "New LB IP Address: ${NEW_IP}"

# Update DNS records
gcloud dns record-sets transaction start --zone "$ZONE"

for record in "${RECORDS[@]}"; do
  echo "${record}.api.${DOMAIN}."

  gcloud dns record-sets transaction remove \
    --name "${record}.api.${DOMAIN}." --ttl "$TTL" \
    --type A --zone "$ZONE" "${OLD_IP}"

  gcloud dns record-sets transaction add \
    --name "${record}.api.${DOMAIN}." --ttl "$TTL" \
    --type A --zone "$ZONE" "${NEW_IP}"
done

gcloud dns record-sets transaction execute --zone "$ZONE"

The outcome of the script is shown below. Note how changes are executed as part of a transaction, by automatically creating a transaction.yaml file. The file contains the six DNS changes, three additions and three deletions. The command executes the transaction and then deletes the transaction.yaml file.

> sh ./part3_set_cloud_dns.sh
Old LB IP Address: 35.193.208.115
New LB IP Address: 35.238.196.231

Transaction started [transaction.yaml].

dev.api.storefront-demo.com.
Record removal appended to transaction at [transaction.yaml].
Record addition appended to transaction at [transaction.yaml].

test.api.storefront-demo.com.
Record removal appended to transaction at [transaction.yaml].
Record addition appended to transaction at [transaction.yaml].

uat.api.storefront-demo.com.
Record removal appended to transaction at [transaction.yaml].
Record addition appended to transaction at [transaction.yaml].

Executed transaction [transaction.yaml] for managed-zone [storefront-demo-com-zone].
Created [https://www.googleapis.com/dns/v1/projects/gke-confluent-atlas/managedZones/storefront-demo-com-zone/changes/53].

ID  START_TIME                STATUS
55  2019-01-16T04:54:14.984Z  pending
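
Because the script extracts both IP addresses by parsing tabular gcloud output with awk, an unexpected output layout could leave OLD_IP or NEW_IP empty or malformed. One possible safeguard, sketched below, is to validate both values before starting the transaction. The is_ipv4 and dns_update_needed functions are hypothetical helpers written for this illustration and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical guard: return 0 if the argument is a dotted-quad IPv4 address.
is_ipv4() {
  local ip="$1" octet
  local re='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
  [[ "$ip" =~ $re ]] || return 1
  # Each captured octet must be in the range 0-255 (10# forces base 10).
  for octet in "${BASH_REMATCH[@]:1}"; do
    (( 10#$octet <= 255 )) || return 1
  done
}

# Return 0 only if both addresses are valid and actually differ.
dns_update_needed() {
  local old_ip="$1" new_ip="$2"
  is_ipv4 "$old_ip" && is_ipv4 "$new_ip" && [[ "$old_ip" != "$new_ip" ]]
}

# Demonstration with the addresses from the sample output above.
if dns_update_needed '35.193.208.115' '35.238.196.231'; then
  echo "Safe to start the Cloud DNS transaction"
else
  echo "Refusing to update DNS: addresses are missing, malformed, or identical" >&2
fi
```

In the DNS script, such a check would sit between the two echo statements and the transaction start command, with the script exiting on failure.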

Based on my own domain and cluster details, the transaction.yaml file looks as follows. Again, note the six DNS changes, three additions, followed by three deletions (gist).


additions:
- kind: dns#resourceRecordSet
  name: storefront-demo.com.
  rrdatas:
  - ns-cloud-a1.googledomains.com. cloud-dns-hostmaster.google.com. 25 21600 3600
    259200 300
  ttl: 21600
  type: SOA
- kind: dns#resourceRecordSet
  name: dev.api.storefront-demo.com.
  rrdatas:
  - 35.238.196.231
  ttl: 300
  type: A
- kind: dns#resourceRecordSet
  name: test.api.storefront-demo.com.
  rrdatas:
  - 35.238.196.231
  ttl: 300
  type: A
- kind: dns#resourceRecordSet
  name: uat.api.storefront-demo.com.
  rrdatas:
  - 35.238.196.231
  ttl: 300
  type: A
deletions:
- kind: dns#resourceRecordSet
  name: storefront-demo.com.
  rrdatas:
  - ns-cloud-a1.googledomains.com. cloud-dns-hostmaster.google.com. 24 21600 3600
    259200 300
  ttl: 21600
  type: SOA
- kind: dns#resourceRecordSet
  name: dev.api.storefront-demo.com.
  rrdatas:
  - 35.193.208.115
  ttl: 300
  type: A
- kind: dns#resourceRecordSet
  name: test.api.storefront-demo.com.
  rrdatas:
  - 35.193.208.115
  ttl: 300
  type: A
- kind: dns#resourceRecordSet
  name: uat.api.storefront-demo.com.
  rrdatas:
  - 35.193.208.115
  ttl: 300
  type: A

Confirm DNS Changes

Use the dig command to confirm the DNS records are now correct and that DNS propagation has occurred. The IP address returned by dig should be the external IP address assigned to the front-end of the Google Cloud Load Balancer.

> dig dev.api.storefront-demo.com +short
35.238.196.231

Or, check all three records at once.

printf '%s\n' \
  "dev.api.storefront-demo.com" \
  "test.api.storefront-demo.com" \
  "uat.api.storefront-demo.com" \
  > records.txt && dig -f records.txt +short

35.238.196.231
35.238.196.231
35.238.196.231
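
In a CI/CD pipeline, this manual check could be turned into an assertion that every record resolves to the new load balancer address. The sketch below shows one way to do that; the all_resolve_to function is a hypothetical helper, and the demonstration feeds it the sample answers shown above rather than calling dig, so that it runs without network access.

```shell
#!/bin/bash
# Hypothetical helper: read one IP address per line on stdin and succeed
# only if at least one line was read and every line equals the expected
# address.
all_resolve_to() {
  local expected="$1" ip count=0
  while IFS= read -r ip; do
    [[ "$ip" == "$expected" ]] || return 1
    count=$((count + 1))
  done
  (( count > 0 ))
}

# Demonstration with the sample answers shown above; in practice, pipe in
# the output of `dig -f records.txt +short` instead.
if printf '%s\n' 35.238.196.231 35.238.196.231 35.238.196.231 \
    | all_resolve_to 35.238.196.231; then
  echo "DNS records point at the new load balancer"
else
  echo "DNS records have not propagated yet"
fi
```

Requiring at least one answer matters here, since an empty dig response would otherwise pass the check trivially.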

Optionally, get more verbose output by removing the +short option.

> dig +nocmd dev.api.storefront-demo.com

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30763
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;dev.api.storefront-demo.com.   IN  A

;; ANSWER SECTION:
dev.api.storefront-demo.com. 299 IN A   35.238.196.231

;; Query time: 27 msec
;; SERVER: 8.8.8.8#53(8.8.8.8)
;; WHEN: Wed Jan 16 18:00:49 EST 2019
;; MSG SIZE  rcvd: 72

The resulting records in the Google Cloud DNS management console should look as follows.

screen_shot_2019-01-15_at_11.57.12_pm

JWT-based Authentication

As discussed in the previous post, Istio End-User Authentication for Kubernetes using JSON Web Tokens (JWT) and Auth0, it is typical to restrict end-user access to the Kubernetes cluster, to Namespaces within the cluster, or to Services running within Namespaces, whether those end-users are humans or other applications. In that previous post, we saw an example of applying a machine-to-machine (M2M) Istio Authentication Policy to only the uat Namespace. This scenario is common when you want to control access to resources in non-production environments, such as UAT, for outside test teams accessing the uat Namespace through an external application. To simulate this scenario, we will apply the following Istio Authentication Policy to the uat Namespace (gist).


apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
  namespace: uat
spec:
  peers:
  - mtls: {}
  origins:
  - jwt:
      audiences:
      - "storefront-api-uat"
      issuer: "https://storefront-demo.auth0.com/"
      jwksUri: "https://storefront-demo.auth0.com/.well-known/jwks.json"
  principalBinding: USE_ORIGIN

For the dev and test Namespaces, we will apply an additional, different Istio Authentication Policy. This policy will protect against the possibility of dev and test M2M API consumers interfering with uat M2M API consumers and vice-versa. Below is the dev and test version of the Policy (gist).


apiVersion: authentication.istio.io/v1alpha1
kind: Policy
metadata:
  name: default
  namespace: dev
spec:
  peers:
  - mtls: {}
  origins:
  - jwt:
      audiences:
      - "storefront-api-dev-test"
      issuer: "https://storefront-demo.auth0.com/"
      jwksUri: "https://storefront-demo.auth0.com/.well-known/jwks.json"
  principalBinding: USE_ORIGIN

Testing Authentication

Using Postman, with the ‘Bearer Token’ type authentication method, as detailed in the previous post, a call to a Storefront API resource in the uat Namespace should succeed. This also confirms DNS and HTTPS are working properly.

screen_shot_2019-01-15_at_11.58.41_pm

The dev and test Namespaces require different authentication. Trying to use no authentication, or authenticating as a UAT API consumer, will result in a 401 Unauthorized HTTP status, along with the error message, ‘Origin authentication failed.’

screen_shot_2019-01-16_at_12.00.55_am
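
When a request is rejected this way, it can help to check which audience the token was actually issued for, since the two Policies differ only in their audiences value. The sketch below decodes the payload of a JWT using standard command-line tools. The jwt_payload function is a hypothetical helper, the token is an unsigned sample built inside the script rather than a real Auth0 token, and the use of base64 -d assumes GNU coreutils (some older macOS versions use -D instead). Decoding does not verify the signature.

```shell
#!/bin/bash
# Hypothetical helper: decode the payload (second dot-separated segment)
# of a JWT. Segments are base64url-encoded without padding, so translate
# the alphabet back to standard base64 and restore '=' padding first.
jwt_payload() {
  local segment
  segment=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  while (( ${#segment} % 4 != 0 )); do
    segment+='='
  done
  printf '%s' "$segment" | base64 -d
}

# Build an unsigned sample token for the demonstration. A real token would
# come from the authorization server and carry a valid header and signature.
claims='{"aud":"storefront-api-uat","iss":"https://storefront-demo.auth0.com/"}'
encoded=$(printf '%s' "$claims" | base64 | tr '/+' '_-' | tr -d '=\n')
sample_token="header.${encoded}.signature"

# Print the decoded claims, including the aud (audience) value.
jwt_payload "$sample_token"
echo
```

A token whose aud claim is storefront-api-uat would be expected to fail against the dev and test Policy, which only lists storefront-api-dev-test.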

Conclusion

In this brief post, we demonstrated how to create a GKE cluster with Istio 1.0.x, containing three virtual clusters, or Namespaces. Each Namespace represents an environment, which is part of an application’s SDLC. We enforced HTTP over TLS (HTTPS) using a wildcard SSL/TLS certificate. We also enforced end-user authentication using JWT-based OAuth 2.0 with Auth0. Lastly, we provided user-friendly DNS routing to each environment, using Google Cloud DNS. Short of a fully managed API Gateway, like Apigee, and automating the execution of the scripts with Jenkins or Spinnaker, this cluster is ready to provide a functional path to Production for developing our Storefront API.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.


