Archive for category Continuous Delivery

Provision and Deploy a Consul Cluster on AWS, using Terraform, Docker, and Jenkins



Modern DevOps tools, such as HashiCorp’s Packer and Terraform, make it easier to provision and manage complex cloud architecture. Utilizing a CI/CD server, such as Jenkins, to securely automate the use of these DevOps tools, ensures quick and consistent results.

In a recent post, Distributed Service Configuration with Consul, Spring Cloud, and Docker, we built a Consul cluster using Docker swarm mode, to host distributed configurations for a Spring Boot application. The cluster was built locally with VirtualBox. This architecture is fine for development and testing, but not for use in Production.

In this post, we will deploy a highly available three-node Consul cluster to AWS. We will use Terraform to provision a set of EC2 instances and accompanying infrastructure. The instances will be built from a hybrid AMIs containing the new Docker Community Edition (CE). In a recent post, Baking AWS AMI with new Docker CE Using Packer, we provisioned an Ubuntu AMI with Docker CE, using Packer. We will deploy Docker containers to each EC2 host, containing an instance of Consul server.

All source code can be found on GitHub.


I have chosen Jenkins to automate all of the post’s build, provisioning, and deployment tasks. However, none of the code is written specific to Jenkins; you may run all of it from the command line.

For this post, I have built four projects in Jenkins, as follows:

  1. Provision Docker CE AMI: Builds Ubuntu AMI with Docker CE, using Packer
  2. Provision Consul Infra AWS: Provisions Consul infrastructure on AWS, using Terraform
  3. Deploy Consul Cluster AWS: Deploys Consul to AWS, using Docker
  4. Destroy Consul Infra AWS: Destroys Consul infrastructure on AWS, using Terraform

Jenkins UI

We will primarily be using the ‘Provision Consul Infra AWS’, ‘Deploy Consul Cluster AWS’, and ‘Destroy Consul Infra AWS’ Jenkins projects in this post. The fourth Jenkins project, ‘Provision Docker CE AMI’, automates the steps found in the recent post, Baking AWS AMI with new Docker CE Using Packer, to build the AMI used to provision the EC2 instances in this post.

Consul AWS Diagram 2


Using Terraform, we will provision EC2 instances in three different Availability Zones within the US East 1 (N. Virginia) Region. Using Terraform’s Amazon Web Services (AWS) provider, we will create the following AWS resources:

  • (1) Virtual Private Cloud (VPC)
  • (1) Internet Gateway
  • (1) Key Pair
  • (3) Elastic Cloud Compute (EC2) Instances
  • (2) Security Groups
  • (3) Subnets
  • (1) Route
  • (3) Route Tables
  • (3) Route Table Associations

The final AWS architecture should resemble the following:

Consul AWS Diagram

Production Ready AWS

Although we have provisioned a fairly complete VPC for this post, it is far from being ready for Production. I have created two security groups, limiting the ingress and egress to the cluster. However, to further productionize the environment would require additional security hardening. At a minimum, you should consider adding public/private subnets, NAT gateways, network access control list rules (network ACLs), and the use of HTTPS for secure communications.

In production, applications would communicate with Consul through local Consul clients. Consul clients would take part in the LAN gossip pool from different subnets, Availability Zones, Regions, or VPCs using VPC peering. Communications would be tightly controlled by IAM, VPC, subnet, IP address, and port.

Also, you would not have direct access to the Consul UI through a publicly exposed IP or DNS address. Access to the UI would be removed altogether or locked down to specific IP addresses, and accessed restricted to secure communication channels.


We will achieve high availability (HA) by clustering three Consul server nodes across the three Elastic Cloud Compute (EC2) instances. In this minimally sized, three-node cluster of Consul servers, we are protected from the loss of one Consul server node, one EC2 instance, or one Availability Zone(AZ). The cluster will still maintain a quorum of two nodes. An additional level of HA that Consul supports, multiple datacenters (multiple AWS Regions), is not demonstrated in this post.


Having Docker CE already installed on each EC2 instance allows us to execute remote Docker commands over SSH from Jenkins. These commands will deploy and configure a Consul server node, within a Docker container, on each EC2 instance. The containers are built from HashiCorp’s latest Consul Docker image pulled from Docker Hub.

Getting Started

Preliminary Steps

If you have built infrastructure on AWS with Terraform, these steps should be familiar to you:

  1. First, you will need an AMI with Docker. I suggest reading Baking AWS AMI with new Docker CE Using Packer.
  2. You will need an AWS IAM User with the proper access to create the required infrastructure. For this post, I created a separate Jenkins IAM User with PowerUser level access.
  3. You will need to have an RSA public-private key pair, which can be used to SSH into the EC2 instances and install Consul.
  4. Ensure you have your AWS credentials set. I usually source mine from a .env file, as environment variables. Jenkins can securely manage credentials, using secret text or files.
  5. Fork and/or clone the Consul cluster project from  GitHub.
  6. Change the aws_key_name and public_key_path variable values to your own RSA key, in the file
  7. Change the aws_amis_base variable values to your own AMI ID (see step 1)
  8. If you are do not want to use the US East 1 Region and its AZs, modify the,, and files.
  9. Disable Terraform’s remote state or modify the resource to match your remote state configuration, in the file. I am using an Amazon S3 bucket to store my Terraform remote state.

Building an AMI with Docker

If you have not built an Amazon Machine Image (AMI) for use in this post already, you can do so using the scripts provided in the previous post’s GitHub repository. To automate the AMI build task, I built the ‘Provision Docker CE AMI’ Jenkins project. Identical to the other three Jenkins projects in this post, this project has three main tasks, which include: 1) SCM: clone the Packer AMI GitHub project, 2) Bindings: set up the AWS credentials, and 3) Build: run Packer.

The SCM and Bindings tasks are identical to the other projects (see below for details), except for the use of a different GitHub repository. The project’s Build step, which runs the script looks as follows:


The resulting AMI ID will need to be manually placed in Terraform’s file, before provisioning the AWS infrastructure with Terraform. The new AMI ID will be displayed in Jenkin’s build output.


Provisioning with Terraform

Based on the modifications you made in the Preliminary Steps, execute the terraform validate command to confirm your changes. Then, run the terraform plan command to review the plan. Assuming are were no errors, finally, run the terraform apply command to provision the AWS infrastructure components.

In Jenkins, I have created the ‘Provision Consul Infra AWS’ project. This project has three tasks, which include: 1) SCM: clone the GitHub project, 2) Bindings: set up the AWS credentials, and 3) Build: run Terraform. Those tasks look as follows:


You will obviously need to use your modified GitHub project, incorporating the configuration changes detailed above, as the SCM source for Jenkins.

Jenkins Credentials

You will also need to configure your AWS credentials.


The script provisions the AWS infrastructure using Terraform. The script also updates Terraform’s remote state. Remember to update the remote state configuration in the script to match your personal settings.

cd tf_env_aws/

terraform remote config \
  -backend=s3 \
  -backend-config="bucket=your_bucket" \
  -backend-config="key=terraform_consul.tfstate" \

terraform plan
terraform apply

The Jenkins build output should look similar to the following:


Although the build only takes about 90 seconds to complete, the EC2 instances could take a few extra minutes to complete their Status Checks and be completely ready. The final results in the AWS EC2 Management Console should look as follows:

EC2 Management Console

Note each EC2 instance is running in a different US East 1 Availability Zone.

Installing Consul

Once the AWS infrastructure is running and the EC2 instances have completed their Status Checks successfully, we are ready to deploy Consul. In Jenkins, I have created the ‘Deploy Consul Cluster AWS’ project. This project has three tasks, which include: 1) SCM: clone the GitHub project, 2) Bindings: set up the AWS credentials, and 3) Build: run an SSH remote Docker command on each EC2 instance to deploy Consul. The SCM and Bindings tasks are identical to the project above. The project’s Build step looks as follows:


First, the script deletes any previous instances of Consul containers. This is helpful if you need to re-deploy Consul. Next, the script executes a series of SSH remote Docker commands to install and configure Consul on each EC2 instance.

export ec2_server1_private_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-1' \
  --output text --query 'Reservations[*].Instances[*].PrivateIpAddress')
echo "consul-server-1 private ip: ${ec2_server1_private_ip}"
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-1' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')

ssh -oStrictHostKeyChecking=no -T \
  -i ~/.ssh/consul_aws_rsa \
  ubuntu@${ec2_public_ip} << EOSSH
  docker run -d \
    --net=host \
    --hostname ${consul_server} \
    --name ${consul_server} \
    --env "SERVICE_IGNORE=true" \
    --env "CONSUL_CLIENT_INTERFACE=eth0" \
    --env "CONSUL_BIND_INTERFACE=eth0" \
    --volume /home/ubuntu/consul/data:/consul/data \
    --publish 8500:8500 \
    consul:latest \
    consul agent -server -ui -client= \
      -bootstrap-expect=3 \
      -advertise='{{ GetInterfaceIP "eth0" }}' \

  sleep 5
  docker logs consul-server-1
  docker exec -i consul-server-1 consul members
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-2' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')

ssh -oStrictHostKeyChecking=no -T \
  -i ~/.ssh/consul_aws_rsa \
  ubuntu@${ec2_public_ip} << EOSSH
  docker run -d \
    --net=host \
    --hostname ${consul_server} \
    --name ${consul_server} \
    --env "SERVICE_IGNORE=true" \
    --env "CONSUL_CLIENT_INTERFACE=eth0" \
    --env "CONSUL_BIND_INTERFACE=eth0" \
    --volume /home/ubuntu/consul/data:/consul/data \
    --publish 8500:8500 \
    consul:latest \
    consul agent -server -ui -client= \
      -advertise='{{ GetInterfaceIP "eth0" }}' \
      -retry-join="${ec2_server1_private_ip}" \

  sleep 5
  docker logs consul-server-2
  docker exec -i consul-server-2 consul members
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-3' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')

ssh -oStrictHostKeyChecking=no -T \
  -i ~/.ssh/consul_aws_rsa \
  ubuntu@${ec2_public_ip} << EOSSH
  docker run -d \
    --net=host \
    --hostname ${consul_server} \
    --name ${consul_server} \
    --env "SERVICE_IGNORE=true" \
    --env "CONSUL_CLIENT_INTERFACE=eth0" \
    --env "CONSUL_BIND_INTERFACE=eth0" \
    --volume /home/ubuntu/consul/data:/consul/data \
    --publish 8500:8500 \
    consul:latest \
    consul agent -server -ui -client= \
      -advertise='{{ GetInterfaceIP "eth0" }}' \
      -retry-join="${ec2_server1_private_ip}" \

  sleep 5
  docker logs consul-server-3
  docker exec -i consul-server-3 consul members
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-1' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')
echo " "
echo "*** Consul UI: http://${ec2_public_ip}:8500/ui/ ***"

The entire Jenkins build process only takes about 30 seconds. Afterward, the output from a successful Jenkins build should show that all three Consul server instances are running, have formed a quorum, and have elected a Leader.


Persisting State

The Consul Docker image exposes VOLUME /consul/data, which is a path were Consul will place its persisted state. Using Terraform’s remote-exec provisioner, we create a directory on each EC2 instance, at /home/ubuntu/consul/config. The docker run command bind-mounts the container’s /consul/data path to the EC2 host’s /home/ubuntu/consul/config directory.

According to Consul, the Consul server container instance will ‘store the client information plus snapshots and data related to the consensus algorithm and other state, like Consul’s key/value store and catalog’ in the /consul/data directory. That container directory is now bind-mounted to the EC2 host, as demonstrated below.


Accessing Consul

Following a successful deployment, you should be able to use the public URL, displayed in the build output of the ‘Deploy Consul Cluster AWS’ project, to access the Consul UI. Clicking on the Nodes tab in the UI, you should see all three Consul server instances, one per EC2 instance, running and healthy.

Consul UI

Destroying Infrastructure

When you are finished with the post, you may want to remove the running infrastructure, so you don’t continue to get billed by Amazon. The ‘Destroy Consul Infra AWS’ project destroys all the AWS infrastructure, provisioned as part of this post, in about 60 seconds. The project’s SCM and Bindings tasks are identical to the both previous projects. The Build step calls the script, which is included in the GitHub project. The script executes the terraform destroy -force command. It will delete all running infrastructure components associated with the post and update Terraform’s remote state.



This post has demonstrated how modern DevOps tooling, such as HashiCorp’s Packer and Terraform, make it easy to build, provision and manage complex cloud architecture. Using a CI/CD server, such as Jenkins, to securely automate the use of these tools, ensures quick and consistent results.

, , , , , , , , , ,

Leave a comment

Infrastructure as Code Maturity Model

Systematically Evolving an Organization’s Infrastructure

Infrastructure and software development teams are increasingly building and managing infrastructure using automated tools that have been described as “infrastructure as code.” – Kief Morris (Infrastructure as Code)

The process of managing and provisioning computing infrastructure and their configuration through machine-processable, declarative, definition files, rather than physical hardware configuration or the use of interactive configuration tools. – Wikipedia (abridged)

Convergence of CD, Cloud, and IaC

In 2011, co-authors Jez Humble, formerly of ThoughtWorks, and David Farley, published their ground-breaking book, Continuous Delivery. Humble and Farley’s book set out, in their words, to automate the ‘painful, risky, and time-consuming process’ of the software ‘build, deployment, and testing process.


Over the next five years, Humble and Farley’s Continuous Delivery made a significant contribution to the modern phenomena of DevOps. According to Wikipedia, DevOps is the ‘culture, movement or practice that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes.

In parallel with the growth of DevOps, Cloud Computing continued to grow at an explosive rate. Amazon pioneered modern cloud computing in 2006 with the launch of its Elastic Compute Cloud. Two years later, in 2008, Microsoft launched its cloud platform, Azure. In 2010, Rackspace launched OpenStack.

Today, there is a flock of ‘cloud’ providers. Their services fall into three primary service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Since we will be discussing infrastructure, we will focus on IaaS and PaaS. Leaders in this space include Google Cloud Platform, RedHat, Oracle Cloud, Pivotal Cloud Foundry, CenturyLink Cloud, Apprenda, IBM SmartCloud Enterprise, and Heroku, to mention just a few.

Finally, fast forward to June 2016, O’Reilly releases Infrastructure as Code
Managing Servers in the Cloud
, by Kief Morris, ThoughtWorks. This crucial work bridges many of the concepts first introduced in Humble and Farley’s Continuous Delivery, with the evolving processes and practices to support cloud computing.


This post examines how to apply the principles found in the Continuous Delivery Maturity Model, an analysis tool detailed in Humble and Farley’s Continuous Delivery, and discussed herein, to the best practices found in Morris’ Infrastructure as Code.

Infrastructure as Code

Before we continue, we need a shared understanding of infrastructure as code. Below are four examples of infrastructure as code, as Wikipedia defined them, ‘machine-processable, declarative, definition files.’ The code was written using four popular tools, including HashiCorp Packer, Docker, AWS CloudFormation, and HashiCorp Terraform. Executing the code provisions virtualized cloud infrastructure.

HashiCorp Packer

Packer definition of an AWS EBS-backed AMI, based on Ubuntu.

  "variables": {
    "aws_access_key": "",
    "aws_secret_key": ""
  "builders": [{
    "type": "amazon-ebs",
    "access_key": "{{user `aws_access_key`}}",
    "secret_key": "{{user `aws_secret_key`}}",
    "region": "us-east-1",
    "source_ami": "ami-fce3c696",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "packer-example {{timestamp}}"


Dockerfile, used to create a Docker image, and subsequently a Docker container, running MongoDB.

FROM ubuntu:16.04
RUN apt-key adv --keyserver hkp:// --recv EA312927
RUN echo "deb" \
$(cat /etc/lsb-release | grep DISTRIB_CODENAME | cut -d= -f2)/mongodb-org/3.2 multiverse" | \
tee /etc/apt/sources.list.d/mongodb-org-3.2.list
RUN apt-get update && apt-get install -y mongodb-org
RUN mkdir -p /data/db
EXPOSE 27017
ENTRYPOINT ["/usr/bin/mongod"]

AWS CloudFormation

AWS CloudFormation declaration for three services enabled on a running instance.

      enabled: "true"
      ensureRunning: "true"
        - "/etc/nginx/nginx.conf"
        - "/var/www/html"
      enabled: "true"
      ensureRunning: "true"
          - "php"
          - "spawn-fcgi"
      enabled: "false"
      ensureRunning: "false"

HashiCorp Terraform

Terraform definition of an AWS m1.small EC2 instance, running NGINX on Ubuntu.

resource "aws_instance" "web" {
  connection { user = "ubuntu" }
instance_type = "m1.small"
Ami = "${lookup(var.aws_amis, var.aws_region)}"
Key_name = "${}"
vpc_security_group_ids = ["${}"]
Subnet_id = "${}"
provisioner "remote-exec" {
  inline = [
    "sudo apt-get -y update",
    "sudo apt-get -y install nginx",
    "sudo service nginx start",

Cloud-based Infrastructure as a Service

The previous examples provide but the narrowest of views into the potential breadth of infrastructure as code. Leading cloud providers, such as Amazon and Microsoft, offer hundreds of unique offerings, most of which may be defined and manipulated through code — infrastructure as code.



What Infrastructure as Code?

The question many ask is, what types of infrastructure can be defined as code? Although vendors and cloud providers have their unique names and descriptions, most infrastructure is divided into a few broad categories:

  • Compute
  • Databases, Caching, and Messaging
  • Storage, Backup, and Content Delivery
  • Networking
  • Security and Identity
  • Monitoring, Logging, and Analytics
  • Management Tooling

Continuous Delivery Maturity Model

We also need a common understanding of the Continuous Delivery Maturity Model. According to Humble and Farley, the Continuous Delivery Maturity Model was distilled as a model that ‘helps to identify where an organization stands in terms of the maturity of its processes and practices and defines a progression that an organization can work through to improve.

The Continuous Delivery Maturity Model is a 5×6 matrix, consisting of six areas of practice and five levels of maturity. Each of the matrix’s 30 elements defines a required discipline an organization needs to follow, to be considered at that level of maturity within that practice.

Areas of Practice

The CD Maturity Model examines six broad areas of practice found in most enterprise software organizations:

  • Build Management and Continuous Integration
  • Environments and Deployment
  • Release Management and Compliance
  • Testing
  • Data Management
  • Configuration Management

Levels of Maturity

The CD Maturity Model defines five level of increasing maturity, from a score of -1 to 3, from Regressive to Optimizing:

  • Level 3: Optimizing – Focus on process improvement
  • Level 2: Quantitatively Managed – Process measured and controlled
  • Level 1: Consistent – Automated processes applied across whole application lifecycle
  • Level 0: Repeatable – Process documented and partly automated
  • Level -1: Regressive – Processes unrepeatable, poorly controlled, and reactive


Maturity Model Analysis

The CD Maturity Model is an analysis tool. In my experience, organizations use the maturity model in one of two ways. First, an organization completes an impartial evaluation of their existing levels of maturity across all areas of practice. Then, the organization focuses on improving the overall organization’s maturity, attempting to achieve a consistent level of maturity across all areas of practice. Alternately, the organization concentrates on a subset of the practices, which have the greatest business value, or given their relative immaturity, are a detriment to the other practices.


* CD Maturity Model Analysis Tool available on GitHub.

Infrastructure as Code Maturity Levels

Although infrastructure as code is not explicitly called out as a practice in the CD Maturity Model, many of it’s best practices can be found in the maturity model. For example, the model prescribes automated environment provisioning, orchestrated deployments, and the use of metrics for continuous improvement.

Instead of trying to retrofit infrastructure as code into the existing CD Maturity Model, I believe it is more effective to independently apply the model’s five levels of maturity to infrastructure as code. To that end, I have selected many of the best practices from the book, Infrastructure as Code, as well as from my experiences. Those selected practices have been distributed across the model’s five levels of maturity.

The result is the first pass at an evolving Infrastructure as Code Maturity Model. This model may be applied alongside the broader CD Maturity Model, or independently, to evaluate and further develop an organization’s infrastructure practices.

IaC Level -1: Regressive

Processes unrepeatable, poorly controlled, and reactive

  • Limited infrastructure is provisioned and managed as code
  • Infrastructure provisioning still requires many manual processes
  • Infrastructure code is not written using industry-standard tooling and patterns
  • Infrastructure code not built, unit-tested, provisioned and managed, as part of a pipeline
  • Infrastructure code, processes, and procedures are inconsistently documented, and not available to all required parties

IaC Level 0: Repeatable

Processes documented and partly automated

  • All infrastructure code and configuration are stored in a centralized version control system
  • Testing, provisioning, and management of infrastructure are done as part of automated pipeline
  • Infrastructure is deployable as individual components
  • Leverages programmatic interfaces into physical devices
  • Automated security inspection of components and dependencies
  • Self-service CLI or API, where internal customers provision their resources
  • All code, processes, and procedures documented and available
  • Immutable infrastructure and processes

IaC Level 1: Consistent

Automated processes applied across whole application lifecycle

  • Fully automated provisioning and management of infrastructure
  • Minimal use of unsupported, ‘home-grown’ infrastructure tooling
  • Unit-tests meet code-coverage requirements
  • Code is continuously tested upon every check-in to version control system
  • Continuously available infrastructure using zero-downtime provisioning
  • Uses configuration registries
  • Templatized configuration files (no awk/sed magic)
  • Secrets are securely management
  • Auto-scaling based on user-defined load characteristics

IaC Level 2: Quantitatively Managed

Processes measured and controlled

  • Uses infrastructure definition files
  • Capable of automated rollbacks
  • Infrastructure and supporting systems are highly available and fault tolerant
  • Externalized configuration, no black box API to modify configuration
  • Fully monitored infrastructure with configurable alerting
  • Aggregated, auditable infrastructure logging
  • All code, processes, and procedures are well documented in a Knowledge Management System
  • Infrastructure code uses declarative versus imperative programming model, maybe…

IaC Level 3: Optimizing

Focus on process improvement

  • Self-healing, self-configurable, self-optimizing, infrastructure
  • Performance tested and monitored against business KPIs
  • Maximal infrastructure utilization and workload density
  • Adheres to Cloud Native and 12-Factor patterns
  • Cloud-agnostic code that minimizes cloud vendor lock-in

All opinions in this post are my own and not necessarily the views of my current employer or their clients.

, , , , , , ,


Operational Readiness Analysis

“Analysis: a detailed examination of the elements or structure of something, typically as a basis for discussion or interpretation.” — Google


Recently, I had the opportunity, along with a colleague, to perform an operational readiness analysis of a client’s new application platform. The platform was in late-stage development, only a few weeks from going live. Being relatively new to the project team, our objective was to determine the amount of work remaining to make the platform operational, from a technical perspective. As well, we wanted to ensure there were no gaps in the project’s scope, as it related to technical operational readiness.

There were various approaches we could have taken to establish the amount of unfinished work and any existing gaps. We could have reviewed the state of the project in the client’s Agile project management system. We could have discussed the project’s state with the Product Owner, Project Manager, Team Lead, and Business Analyst. Neither of these methods on their own would have provided us with a complete picture.

The approach we ultimately took was what we’ve coined, an Operational Readiness Analysis, or our trendier title, the ‘State of the DevOps’. This approach is particularly effective and highly interactive. The analysis may be conducted at any stage of the Software Development Lifecycle (SDLC), to review a project’s operational state of readiness. The analysis involves five simple steps:

  1. Categorize
  2. Itemize
  3. Organize
  4. Prioritize
  5. Document

Grab Your Post-It Notes

We began, where all good ThoughtWorkers tend to begin, with a meeting room, whiteboard, dry-erase markers, and lots of brightly colored Post-It notes. ThoughtWorkers do love their Post-It notes. We gathered a small subset of the project team, including the Product Owner, Project Manager, and the team’s DevOps Engineer.

Step 1: Categorize

To start, we broadly categorized the operational requirements into buckets of work. In my experience, categorization typically takes one of two forms, either the use of role-based descriptors, like ‘Security’ and ‘Monitoring’, or the use of the ‘ilities’, such as ‘Scalability’ and ‘Maintainability’.

The ‘ilities’ are often referenced in regards to software architecture and the functional and nonfunctional requirements (NFRs) of a project. Although there are many ‘ilities’, project requirements frequently include Usability, Scalability, Maintainability, Reliability, Testability, Availability, and Serviceability. Although I enjoy referring to the ‘ilities’ when drafting project requirements, I find a broad audience more readily understands the high-level category descriptions.

We collectively agreed on eight categories, in addition to a catch-all ‘Other’ category, and distributed them on the whiteboard.


Role-based categories tend to include the following:

  • Security and Compliance
  • Monitoring and Alerting
  • Logging
  • Continuous Integration and Delivery
  • Release Management
  • Configuration
  • Environment Management
  • Infrastructure
  • Networking
  • Data Management
  • High Availability and Fault Tolerance
  • Performance
  • Backup and Restoration
  • Support and Maintenance
  • Documentation and Training
  • Technical Debt
  • Other

Step 2: Itemize

Next, each person placed as many Post-It notes as they wanted, below the categories on the whiteboard. Each Post-It represented an item someone felt needed to be addressed as part the analysis. Items might represent requirements currently in progress, or requirements still untouched in the project’s backlog. A Post-It note might contain a new requirement. Maybe there is a question about an aspect of the project’s relative importance, or there were concerns about how an aspect of the project had been implemented.

Participants placed an average of five to ten Post-It notes on the whiteboard. The end product was a collective mind-map of the project’s operational state (the following is only an example and doesn’t represent the actual results of any real client analysis).


Whereas categories tend to be generic in nature, the items are usually particular to the project, and it’s technology choices. The ‘Monitoring/Logging’ category might include individual items such as ‘Log Rotation Policy’ or ‘Splunk vs. ELK?’. The ‘Security’ category might include individual items such as ‘Need Password Rotation’ or ‘SSL Certificate Management’.

Participants often use one or two words to summarize more complex thoughts. For example, ‘Rollback’ might be stating, ‘we still don’t have a good rollback strategy for the database’. ‘Release Frequency’ may be questioning, ‘why can’t we release more frequently?’. Make sure to discuss and capture the participant’s full thoughts behind each Post-It, the during the Organize, Prioritize, and Document stages.

Don’t let participants give up too soon placing Post-It notes on the board. Often, one participant’s Post-It will spark additional thoughts other team members. Don’t be afraid to suggest a few items, if the group needs some initial motivation. Lastly, fear not, it’s never too late to go back and add missing items, uncovered in later stages of analysis.

Step 3: Organize

Quickly and non-judgmentally, review all items on the whiteboard. Group duplicates together. Ask for clarification on items where necessary. Move miscategorized items, if the author and team agree there is a more appropriate category.

Examine the ‘Other’ category. If there are items in the ‘Other’ category that are similar, consider adding a new category to capture them. Maybe the team missed identifying a key category, earlier. Conversely, there is nothing wrong with leaving a few stragglers in the ‘Other’ category; don’t overthink the exercise.

Participants should leave this stage with a general understanding of each item on the whiteboard.

Step 4: Prioritize

With all items identified, organized, and understood, discuss each item’s status. Does the item represent incomplete requirements? Does everyone agree an item’s requirements are complete? Team members may not agree on the completeness of particular items. Inconsistencies may reflect unclear requirements. More often, I find, disagreements are a result of an inconsistent or incomplete implementation of requirements across environments — Development, QA, Performance, Staging, and Production.

Why are operational requirements frequently implemented inconsistently? Often, a story is played early in the development stage, prior to all environments being available. For example, a story is played to add monitoring to Development and QA. The story is completed, tested, and closed. Afterward, the Performance environment is created. Later yet, the Staging and Production environments are created. But, there are no new stories to address monitoring the new environments.

One way to avoid this common issue with operational requirements is an effective DevOps automation strategy. In this example, an effective strategy would mean all new environments would get monitoring, automatically. The story should broadly address monitoring, and not, short-sightedly, monitoring of specific environments.

Often, only one or two project team members are responsible for ‘all the DevOps stuff’. They alone possess an accurate perspective on an item’s current status.

Gain agreement from the Product Owner, Project Manager, and key team members on the priority of incomplete requirements. Are they ‘Day One’ must-haves, or ‘Day Two’ and can be completed after the application goes live? Flag all high-priority items.

If the status of an item is unknown, such as ‘why is the QA environment always down?!’, note it as an open question and move on. If the item simply requires a quick answer to resolve, like ‘do we backup the database?’, answer it (‘the database is replicated and backed up daily’) and move on.


Step 5: Document

Snap a picture of the whiteboard and document its contents. We chose the client’s knowledge management system as a vehicle to share the whiteboard’s results. We created a table with columns for Category, Item, Description, Priority, Owner, and Notes. Also, it’s helpful to have a column for the project’s corresponding Story ID or Defect ID. Along with the whiteboard, capture key discussion points, questions, and concerns, raised by the participants.

Step 5: Document

Snap a picture of the whiteboard and document its contents. We chose Confluence to share the whiteboard results, using a table with columns for Category, Item, Description, Priority, Owner, and Notes. We also bulleted all key discussion points, questions, and concerns, raised by the participants.


Make the analysis results available to all team members. Let the team ask questions and poke holes in the results. Adjust the results if necessary.

Next Steps

The actions you take, given the results of the analysis, are up to you. In our case, we ensured all high-priority items were called out on the team’s Agile board. Any new items were captured in the client’s Agile project management system. Finally, we ensured open questions and concerns were addressed in a timely fashion. We continue to track each item’s status weekly, throughout the launch and post-launch periods.

We anticipate conducting a follow-up analysis, thirty days after launch. The goal will be to evaluate the effectiveness of the first analysis and identify additional operational needs as the application enters a business-as-usual (BAU) application lifecycle phase.

Postscript: the ‘ilities

The ‘ilities’, courtesy of and

  • Scalability
    The capability of a system, network, or process to handle a growing amount of work or its potential to be enlarged to accommodate that growth.
  • Testability
    The practical feasibility of observing a reproducible series of such counterexamples if they do exist.
  • Reliability (Resilience)
    The ability of a system or component to perform its required functions under stated conditions for a specified time.
  • Usability (Performance)
    The degree to which software can be used by specified consumers to achieve quantified objectives with effectiveness, efficiency, and satisfaction in a quantified context of use.
  • Serviceability (Supportability)
    The ability of technical support personnel to install, configure, and monitor computer products, identify exceptions or faults, debug or isolate faults to root cause analysis, and provide hardware or software maintenance in pursuit of solving a problem and restoring the product into service.
  • Availability
    The proportion of time a system is in a functioning condition. Defined in the service-level agreement (SLA).
  • Maintainability
    The ease with which a product can be maintained. Isolate and correct defects or their cause, prevent unexpected breakdowns, maximize a product’s useful life, maximize efficiency, reliability, and safety, meet new requirements, make future maintenance easier, and cope with a changed environment.

All opinions in this post are my own and not necessarily the views of my current employer or their clients.

, , , , ,

Leave a comment

Tips for Better of Software Releases



Although the process of releasing software varies considerably from application to application, we would probably all agree there are common reasons releases tend to fail. Lack of planning and preparation, defective code, infrastructure and release tooling failures, and poor communication, all transcend technology choices and industries.

Indeed, no one can guarantee a release’s success each and every time. However, there are steps we can take to improve the odds of executing consistently successful software releases. Based on my experience with a diverse set of enterprise software organizations, I’ve collected following suggestions for improving release success.

Manage the Release, Don’t Let the Release Manage You

Engage an experienced Release Manager, someone who schedules the release, coordinates the resources, runs the release call, leads the release team through the plan, and communicates status to stakeholders. The Release Manager serves as a single point of contact (aka SPOC) for the release. Having a Release manager allows other release team members to focus on their specific tasks without needless distraction.

A Goal Without a Plan is Just a Wish

Always have an implementation plan (aka IPlan). Whether a single page of paper, or a hundred-page Microsoft Word document, script the entire release in advance. What are the required tasks? What is their order of execution? What resources are required to complete each task? What is the test plan? What is the contingency plan, if or when something goes awry?

Don’t Boil the Ocean

Releases should have specific goals, the fewer, the better. Release goals should be clearly stated at the beginning of the implementation plan. If you feel you need to compress several goals into a single release, you are probably not releasing frequently enough.

For simplicity, separate software-centric releases from infrastructure-centric changes. Don’t tie the success of either, to the success or failure of the other.

Expect Success, Plan for Failure

Channel your inner Boy Scout and always be prepared. Have a contingency plan. Plan for multiple failure scenarios. Individual task failures only lead to release failures when we don’t have a plan to correct for the individual failures quickly. Are additional resources required in the event of a failure? If so, they should be available, and aware of the release plan and the contingency plan, in advance.

Seek the Understanding and Approval of Others

This is one of those times when seeking the understanding and the approval of others is a good thing. Review the plan in advance with all required resources and stakeholders. Ensure there is a complete comprehension of the plan, goals, resource requirements, and timing. Discuss the plan’s level of risk to the organization and their customers. Seek the support and approval of stakeholders when required.

The More I Practice the Luckier I Get

Do a dry run of the plan, even if it is just a verbal walkthrough with the release team. Better yet, do a live run in a production-like staging environment. Adjust the plan if necessary. Also, don’t forget to practice your contingency plan.

Should I Pack a Lunch?

Part of the plan′s review should be a discussion of timing. Release resources should know the estimated total time for the release (aka release window) — best case and not-so-best case. Also, if individual stages are expected to take more than a few minutes to complete, those times should be understood in advance. There is nothing worse than staring endlessly at the third stage of a ten stage continuous integration pipeline for 15 minutes, with no idea of how much long the task is expected to take.

Don’t Start the Game without the Whole Team on the Field

Never start a release without all the required resources present and prepared. A quick release can turn into a long and painful experience if you are waiting on resources. In my experience, the longer a release takes to complete, the greater the risk of failure.

Make a Plan and Stick to It

You wrote the plan, you reviewed the plan, you practiced the plan, and you received the approval of your stakeholders for the plan. So, follow the plan.

If you absolutely must deviate from the plan, take the time to consider the impact and potential risks. Document the deviation for the post-mortem and for planning the next release.

Test Early, Test Often

Testing should not be the final step to confirm the release’s success. Testing should be done continuously and automatically, throughout the release process. Test after each significant stage of the plan. It’s easier to find and correct issues the early they are discovered.

Mute is Your Friend, Always be on Mute

Emotions will flare, words will uncontrollably leap from your lips on occasion, background noise makes following the release call difficult to follow, and diagnosing release issues isn’t best done in front of an audience. Keep the release call focused on the plan and free of emotion.

Are We There Yet?

Consistently communicate before, during, and after the release. Let stakeholders know in advance when the release is starting and how long it is expected to take. Keep stakeholders aware of significant deviations to the plan, especially with customer-facing impact. Let everyone know when the release finishes, and a final release status. Keep communications germane.

Ensure sure everyone understands how the release status will be communicated, be it email, IM, persistent chat, or a web-based status page.

Mistakes are Meant for Learning, not Repeating

Successful or otherwise, follow each release with a blameless post-mortem. A post-mortem might only require a quick five-minute chat. Or, the whole release team might need an hour with a therapist (that’s a joke). Discuss what went right, and what did not go as well as expected. Be keen to focus on repeat problems and problems that were not caught during the release. Most importantly, consider how to continuously improve the release process.

Releasing is a Game, Keep Score

Know your release’s Key Performance Indicators (KPIs) and your Service Level Agreements (SLAs). Understand what matters most to your stakeholders and to your customers. Always know how you are performing against your KPIs and SLAs, and what you need to do to improve them. Dashboards are great tools to display KPI and SLA performance transparently.

Is your failure rate increasing or decreasing? Is one type of task responsible for a majority of your release failures? Is your release taking longer or shorter to complete each time? Did you experience an unexpected outage during the release? For how long? What is the volume of post-release issues caused by, but not discovered during the release?

All opinions in this post are my own and not necessarily the views of my current employer or their clients.

, , , , , , ,

Leave a comment

Spring Music Revisited: Java-Spring-MongoDB Web App with Docker 1.12

Build, test, deploy, and monitor a multi-container, MongoDB-backed, Java Spring web application, using the new Docker 1.12.

Spring Music Infrastructure


** This post and associated project code were updated 9/3/2016 to use Tomcat 8.5.4 with OpenJDK 8.**

This post and the post’s example project represent an update to a previous post, Build and Deploy a Java-Spring-MongoDB Application using Docker. This new post incorporates many improvements made in Docker 1.12, including the use of the new Docker Compose v2 YAML format. The post’s project was also updated to use Filebeat with ELK, as opposed to Logspout, which was used previously.

In this post, we will demonstrate how to build, test, deploy, and manage a Java Spring web application, hosted on Apache Tomcat, load-balanced by NGINX, monitored by ELK with Filebeat, and all containerized with Docker.

We will use a sample Java Spring application, Spring Music, available on GitHub from Cloud Foundry. The Spring Music sample record album collection application was originally designed to demonstrate the use of database services on Cloud Foundry, using the Spring Framework. Instead of Cloud Foundry, we will host the Spring Music application locally, using Docker on VirtualBox, and optionally on AWS.

All files necessary to build this project are stored on the docker_v2 branch of the garystafford/spring-music-docker repository on GitHub. The Spring Music source code is stored on the springmusic_v2 branch of the garystafford/spring-music repository, also on GitHub.

Spring Music Application

Application Architecture

The Java Spring Music application stack contains the following technologies: JavaSpring Framework, AngularJS, Bootstrap, jQueryNGINXApache TomcatMongoDB, the ELK Stack, and Filebeat. Testing frameworks include the Spring MVC Test Framework, Mockito, Hamcrest, and JUnit.

A few changes were made to the original Spring Music application to make it work for this demonstration, including:

  • Move from Java 1.7 to 1.8 (including newer Tomcat version)
  • Add unit tests for Continuous Integration demonstration purposes
  • Modify MongoDB configuration class to work with non-local, containerized MongoDB instances
  • Add Gradle warNoStatic task to build WAR without static assets
  • Add Gradle zipStatic task to ZIP up the application’s static assets for deployment to NGINX
  • Add Gradle zipGetVersion task with a versioning scheme for build artifacts
  • Add context.xml file and MANIFEST.MF file to the WAR file
  • Add Log4j RollingFileAppender appender to send log entries to Filebeat
  • Update versions of several dependencies, including Gradle, Spring, and Tomcat

We will use the following technologies to build, publish, deploy, and host the Java Spring Music application: GradlegitGitHubTravis CIOracle VirtualBoxDockerDocker ComposeDocker MachineDocker Hub, and optionally, Amazon Web Services (AWS).

To increase performance, the Spring Music web application’s static content will be hosted by NGINX. The application’s WAR file will be hosted by Apache Tomcat 8.5.4. Requests for non-static content will be proxied through NGINX on the front-end, to a set of three load-balanced Tomcat instances on the back-end. To further increase application performance, NGINX will also be configured for browser caching of the static content. In many enterprise environments, the use of a Java EE application server, like Tomcat, is still not uncommon.

Reverse proxying and caching are configured thought NGINX’s default.conf file, in the server configuration section:

The three Tomcat instances will be manually configured for load-balancing using NGINX’s default round-robin load-balancing algorithm. This is configured through the default.conf file, in the upstream configuration section:

Client requests are received through port 80 on the NGINX server. NGINX redirects requests, which are not for non-static assets, to one of the three Tomcat instances on port 8080.

The Spring Music application was designed to work with a number of data stores, including MySQL, Postgres, Oracle, MongoDB, Redis, and H2, an in-memory Java SQL database. Given the choice of both SQL and NoSQL databases, we will select MongoDB.

The Spring Music application, hosted by Tomcat, will store and modify record album data in a single instance of MongoDB. MongoDB will be populated with a collection of album data from a JSON file, when the Spring Music application first creates the MongoDB database instance.

Lastly, the ELK Stack with Filebeat, will aggregate NGINX, Tomcat, and Java Log4j log entries, providing debugging and analytics to our demonstration. A similar method for aggregating logs, using Logspout instead of Filebeat, can be found in this previous post.

Kibana 4 Web Console

Continuous Integration

In this post’s example, two build artifacts, a WAR file for the application and ZIP file for the static web content, are built automatically by Travis CI, whenever source code changes are pushed to the springmusic_v2 branch of the garystafford/spring-music repository on GitHub.

Travis CI Output

Following a successful build and a small number of unit tests, Travis CI pushes the build artifacts to the build-artifacts branch on the same GitHub project. The build-artifacts branch acts as a pseudo binary repository for the project, much like JFrog’s Artifactory. These artifacts are used later by Docker to build the project’s immutable Docker images and containers.

Build Artifact Repository

Build Notifications
Travis CI pushes build notifications to a Slack channel, which eliminates the need to actively monitor Travis CI.

Travis CI Slack Notifications

Automation Scripting
The .travis.yaml file, custom Gradle tasks, and the script handles the Travis CI automation described, above.

Travis CI .travis.yaml file:

Custom tasks:

The file:

You can easily replicate the project’s continuous integration automation using your choice of toolchains. GitHub or BitBucket are good choices for distributed version control. For continuous integration and deployment, I recommend Travis CI, Semaphore, Codeship, or Jenkins. Couple those with a good persistent chat application, such as Glider Labs’ Slack or Atlassian’s HipChat.

Building the Docker Environment

Make sure VirtualBox, Docker, Docker Compose, and Docker Machine, are installed and running. At the time of this post, I have the following versions of software installed on my Mac:

  • Mac OS X 10.11.6
  • VirtualBox 5.0.26
  • Docker 1.12.1
  • Docker Compose 1.8.0
  • Docker Machine 0.8.1

To build the project’s VirtualBox VM, Docker images, and Docker containers, execute the build script, using the following command: sh ./ A build script is useful when working with CI/CD automation tools, such as Jenkins CI or ThoughtWorks go. However, to understand the build process, I suggest first running the individual commands, locally.

Deploying to AWS
By simply changing the Docker Machine driver to AWS EC2 from VirtualBox, and providing your AWS credentials, the springmusic environment may also be built on AWS.

Build Process
Docker Machine provisions a single VirtualBox springmusic VM on which host the project’s containers. VirtualBox provides a quick and easy solution that can be run locally for initial development and testing of the application.

Next, the script creates a Docker data volume and project-specific Docker bridge network.

Next, using the project’s individual Dockerfiles, Docker Compose pulls base Docker images from Docker Hub for NGINX, Tomcat, ELK, and MongoDB. Project-specific immutable Docker images are then built for NGINX, Tomcat, and MongoDB. While constructing the project-specific Docker images for NGINX and Tomcat, the latest Spring Music build artifacts are pulled and installed into the corresponding Docker images.

Docker Compose builds and deploys (6) containers onto the VirtualBox VM: (1) NGINX, (3) Tomcat, (1) MongoDB, and (1) ELK.

The NGINX Dockerfile:

The Tomcat Dockerfile:

Docker Compose v2 YAML
This post was recently updated for Docker 1.12, and to use Docker Compose v2 YAML file format. The post’s docker-compose.yml takes advantage of improvements in Docker 1.12 and Docker Compose v2 YAML. Improvements to the YAML file include eliminating the need to link containers and expose ports, and the addition of named networks and volumes.

The Results

Spring Music Infrastructure

Below are the results of building the project.

Testing the Application

Below are partial results of the curl test, hitting the NGINX endpoint. Note the different IP addresses in the Upstream-Address field between requests. This test proves NGINX’s round-robin load-balancing is working across the three Tomcat application instances: music_app_1, music_app_2, and music_app_3.

Also, note the sharp decrease in the Request-Time between the first three requests and subsequent three requests. The Upstream-Response-Time to the Tomcat instances doesn’t change, yet the total Request-Time is much shorter, due to caching of the application’s static assets by NGINX.

Spring Music Application Links

Assuming the springmusic VM is running at, the following links can be used to access various project endpoints. Note the (3) Tomcat instances each map to randomly exposed ports. These ports are not required by NGINX, which maps to port 8080 for each instance. The port is only required if you want access to the Tomcat Web Console. The port, shown below, 32771, is merely used as an example.

* The Tomcat user name is admin and the password is t0mcat53rv3r.

Helpful Links


  • Automate the Docker image build and publish processes
  • Automate the Docker container build and deploy processes
  • Automate post-deployment verification testing of project infrastructure
  • Add Docker Swarm multi-host capabilities with overlay networking
  • Update Spring Music with latest CF project revisions
  • Include scripting example to stand-up project on AWS
  • Add Consul and Consul Template for NGINX configuration

, , , , , , , , , , , ,


Scaffold a RESTful API with Yeoman, Node, Restify, and MongoDB

Using Yeoman, scaffold a basic RESTful CRUD API service, based on Node, Restify, and MongoDB.



NOTE: Generator updated on 11-13-2016 to v0.2.1.

Yeoman generators reduce the repetitive coding of boilerplate functionality and ensure consistency between full-stack JavaScript projects. For several recent Node.js projects, I created the generator-node-restify-mongodb Yeoman generator. This Yeoman generator scaffolds a basic RESTful CRUD API service, a Node application, based on Node.js, Restify, and MongoDB.

According to their website, Restify, used most notably by Netflix, borrows heavily from Express. However, while Express is targeted at browser applications, with templating and rendering, Restify is keenly focused on building API services that are maintainable and observable.

Along with Node, Restify, and MongoDB, theNode application’s scaffolded by the Node-Restify-MongoDB Generator, also implements Bunyan, which includes DTrace, Jasmine, using jasmine-nodeMongoose, and Grunt.

Portions of the scaffolded Node application’s file structure and code are derived from what I consider the best parts of several different projects, including generator-express, generator-restify-mongo, and generator-restify.


To begin, install Yeoman and the generator-node-restify-mongodb using npm. The generator assumes you have pre-installed Node and MongoDB.

npm install -g yo
npm install -g generator-node-restify-mongodb

Then, generate the new project.

mkdir node-restify-mongodb
cd $_
yo node-restify-mongodb

Yeoman scaffolds the application, creating the directory structure, copying required files, and running ‘npm install’ to load the npm package dependencies.



Using the Generated Application

Next, import the supplied set of sample widget documents into the local development instance of MongoDB from the supplied ‘data/widgets.json’ file.

NODE_ENV=development grunt mongoimport --verbose

Similar to Yeoman’s Express Generator, this application contains configuration for three typical environments: ‘Development’ (default), ‘Test’, and ‘Production’. If you want to import the sample widget documents into your Test or Production instances of MongoDB, first change the ‘NODE_ENV’ environment variable value.

NODE_ENV=production grunt mongoimport --verbose

To start the application in a new terminal window, use the following command.

npm start

The output should be similar to the example, below.


To test the application, using jshint and the jasmine-node module, the sample documents must be imported into MongoDB and the application must be running (see above). To test the application, open a separate terminal window, and use the following command.

npm test

The project contains a set of jasmine-node tests, split between the ‘/widgets’ and the ‘/utils’ endpoints. If the application is running correctly, you should see the following output from the tests.


Similarly, the following command displays a code coverage report, using the grunt, mocha, istanbul, and grunt-mocha-istanbul node modules.

grunt coverage

Grunt uses the grunt-mocha-istanbul module to execute the same set of jasmine-node tests as shown above. Based on those tests, the application’s code coverage (statement, line, function, and branch coverage) is displayed.


You may test the running application, directly, by cURLing the ‘/widgets’ endpoints.

curl -X GET -H "Accept: application/json" "http://localhost:3000/widgets"

For more legible output, try prettyjson.

npm install -g prettyjson
curl -X GET -H "Accept: application/json" "http://localhost:3000/widgets" --silent | prettyjson
curl -X GET -H "Accept: application/json" "http://localhost:3000/widgets/SVHXPAWEOD" --silent | prettyjson

The JSON-formatted response body from the HTTP GET requests should look similar to the output, below.


A much better RESTful API testing solution is Postman. Postman provides the ability to individually configure each environment and abstract that environment-specific configuration, such as host and port, from the actual HTTP requests.


Continuous Integration

As part of being published to both the npmjs and Yeoman registries, the generator-node-restify-mongodb generator is continuously integrated on Travis CI. This should provide an addition level of confidence to the generator’s end-users. Currently, Travis CI tests the generator against Node.js v4, v5, and v6, as well as IO.js. Older versions of Node.js may have compatibility issues with the application.


Additionally, Travis CI feeds test results to Coveralls, which displays the generator’s code coverage. Note the code coverage, shown below, is reported for the yeoman generator, not the generator’s scaffolded application. The scaffolded application’s coverage is shown above.


Application Details

API Endpoints

The scaffolded application includes the following endpoints.

# widget resources
var PATH = '/widgets';
server.get({path: PATH, version: VERSION}, findDocuments);
server.get({path: PATH + '/:product_id', version: VERSION}, findOneDocument);{path: PATH, version: VERSION}, createDocument);
server.put({path: PATH, version: VERSION}, updateDocument);
server.del({path: PATH + '/:product_id', version: VERSION}, deleteDocument);

# utility resources
var PATH = '/utils';
server.get({path: PATH + '/ping', version: VERSION}, ping);
server.get({path: PATH + '/health', version: VERSION}, health);
server.get({path: PATH + '/info', version: VERSION}, information);
server.get({path: PATH + '/config', version: VERSION}, configuraton);
server.get({path: PATH + '/env', version: VERSION}, environment);

The Widget

The Widget is the basic document object used throughout the application. It is used, primarily, to demonstrate Mongoose’s Model and Schema. The Widget object contains the following fields, as shown in the sample widget, below.

  "product_id": "4OZNPBMIDR",
  "name": "Fapster",
  "color": "Orange",
  "size": "Medium",
  "price": "29.99",
  "inventory": 5


Use the mongo shell to access the application’s MongoDB instance and display the imported sample documents.

 > show dbs
 > use node-restify-mongodb-development
 > show tables
 > db.widgets.find()

The imported sample documents should be displayed, as shown below.


Environmental Variables

The scaffolded application relies on several environment variables to determine its environment-specific runtime configuration. If these environment variables are present, the application defaults to using the Development environment values, as shown below, in the application’s ‘config/config.js’ file.

var NODE_ENV   = process.env.NODE_ENV   || 'development';
var NODE_HOST  = process.env.NODE_HOST  || '';
var NODE_PORT  = process.env.NODE_PORT  || 3000;
var MONGO_HOST = process.env.MONGO_HOST || '';
var MONGO_PORT = process.env.MONGO_PORT || 27017;
var LOG_LEVEL  = process.env.LOG_LEVEL  || 'info';
var APP_NAME   = 'node-restify-mongodb-';

Future Project TODOs

Future project enhancements include the following:

  • Add filtering, sorting, field selection and paging
  • Add basic HATEOAS-based response features
  • Add authentication and authorization to production MongoDB instance
  • Convert from out-dated jasmine-node to Jasmine?

, , , , , , , , , , , , , ,

1 Comment

Using Weave to Network a Docker Multi-Container Java Application

Use the latest version of Weaveworks’ Weave Net to network a multi-container, Dockerized Java Spring web application.

Introduction Weave Image


The last post demonstrated how to build and deploy the Java Spring Music application to a VirtualBox, multi-container test environment. The environment contained (1) NGINX container, (2) load-balanced Tomcat containers, (1) MongoDB container, (1) ELK Stack container, and (1) Logspout container, all on one VM.

Spring Music

In that post, we used Docker’s links option. The links options, which modifies the container’s /etc/hosts file, allows two Docker containers to communicate with each other. For example, the NGINX container is linked to both Tomcat containers:

  build: nginx/
  ports: "80:80"
   - app01
   - app02

Although container linking works, links are not very practical beyond a small number of static containers or a single container host. With linking, you must explicitly define each service-to-container relationship you want Docker to configure. Linking is not an option with Docker Swarm to link containers across multiple virtual machine container hosts. With Docker Networking in its early ‘experimental’ stages and the Swarm limitation, it’s hard to foresee the use of linking for any uses beyond limited development and test environments.

Weave Net

Weave Net, aka Weave, is one of a trio of products developed by Weaveworks. The other two members of the trio include Weave Run and Weave Scope. According to Weaveworks’ website, ‘Weave Net connects all your containers into a transparent, dynamic and resilient mesh. This is one of the easiest ways to set up clustered applications that run anywhere.‘ Weave allows us to eliminate the dependency on the links connect our containers. Weave does all the linking of containers for us automatically.

Weave v1.1.0

If you worked with previous editions of Weave, you will appreciate that Weave versions v1.0.x and v1.1.0 are significant steps forward in the evolution of Weave. Weaveworks’ GitHub Weave Release page details the many improvements. I also suggest reading Weave ‘Gossip’ DNS, on Weavework’s blog, before continuing. The post details the improvements of Weave v1.1.0. Some of those key new features include:

  • Completely redesigned weaveDNS, dubbed ‘Gossip DNS’
  • Registrations are broadcast to all weaveDNS instances
  • Registered entries are stored in-memory and handle lookups locally
  • Weave router’s gossip implementation periodically synchronizes DNS mappings between peers
  • Ability to recover from network partitions and other transient failures
  • Each peer is aware of the hostnames and IP address of all containers in the Weave network.
  • weave launch now launches all weave components, including the router, weaveDNS and the proxy, greatly simplifying setup
  • weaveDNS is now embedded in the Weave router

Weave-based Network

In this post, we will reuse the Java Spring Music application from the last post. However, we will replace the project’s static dependencies on Docker links with Weave. This post will demonstrate the most basic features of Weave, using a single cluster. In a future post, we will demonstrate how easily Weave also integrates with multiple clusters.

All files for this post can be found in the swarm-weave branch of the GitHub Repository. Instructions to clone are below.


If you recall from the previous post, the Docker Compose YAML file (docker-compose.yml) looked similar to this:

  build: nginx/
  ports: "80:80"
   - app01
   - app02
  hostname: "proxy"

  build: tomcat/
  expose: "8080"
  ports: "8180:8080"
   - nosqldb
   - elk
  hostname: "app01"

  build: tomcat/
  expose: "8080"
  ports: "8280:8080"
   - nosqldb
   - elk
  hostname: "app01"

  build: mongo/
  hostname: "nosqldb"
  volumes: "/opt/mongodb:/data/db"

  build: elk/
   - "8081:80"
   - "8082:9200"
  expose: "5000/upd"

  build: logspout/
  volumes: "/var/run/docker.sock:/tmp/docker.sock"
  links: elk
  ports: "8083:80"
  environment: ROUTE_URIS=logstash://elk:5000

Implementing Weave simplifies the docker-compose.yml, considerably. Below is the new Weave version of the docker-compose.yml. The links option have been removed from all containers. Additionally, the hostnames have been removed, as they serve no real purpose moving forward. The logspout service’s environment option has been modified to use the elk container’s full name as opposed to the hostname.

The only addition is the volumes_from option to the proxy service. We must ensure that the two Tomcat containers start before the NGINX containers. The links option indirectly provided this functionality, previously.

  build: nginx/
   - "80:80"
   - app01
   - app02

  build: tomcat/
   - "8080"
   - "8180:8080"

  build: tomcat/
   - "8080"
   - "8280:8080"

  build: mongo/
   - "/opt/mongodb:/data/db"

  build: elk/
   - "8081:80"
   - "8082:9200"
   - "5000/upd"

  build: logspout/
   - "/var/run/docker.sock:/tmp/docker.sock"
   - "8083:80"
    - ROUTE_URIS=logstash://music_elk_1:5000

Next, we need to modify the NGINX configuration, slightly. In the previous post we referenced the Tomcat service names, as shown below.

upstream backend {
  server app01:8080;
  server app02:8080;

Weave will automatically add the two Tomcat container names to the NGINX container’s /etc/hosts file. We will add these Tomcat container names to NGINX’s configuration file.

upstream backend {
  server music_app01_1:8080;
  server music_app02_1:8080;

In an actual Production environment, we would use a template, along with a service discovery tool, such as Consul, to automatically populate the container names, as containers are dynamically created or destroyed.

Installing and Running Weave

After cloning this post’s GitHub repository, I recommend first installing and configuring Weave. Next, build the container host VM using Docker Machine. Lastly, build the containers using Docker Compose. The script below will take care of all the necessary steps.


# title:          Build Complete Project
# author:         Gary A. Stafford (
# url:    
# description:    Clone and build complete Spring Music Docker project
# to run:         sh ./

# install latest weave
curl -L -o /usr/local/bin/weave && 
chmod a+x /usr/local/bin/weave && 
weave version

# clone project
git clone -b swarm-weave \
  --single-branch --branch swarm-weave \ && 
cd spring-music-docker

# build VM
docker-machine create --driver virtualbox springmusic --debug

# create diectory to store mongo data on host
docker ssh springmusic mkdir /opt/mongodb

# set new environment
docker-machine env springmusic && 
eval "$(docker-machine env springmusic)"

# launch weave and weaveproxy/weaveDNS containers
weave launch &&
tlsargs=$(docker-machine ssh springmusic \
  "cat /proc/\$(pgrep /usr/local/bin/docker)/cmdline | tr '\0' '\n' | grep ^--tls | tr '\n' ' '")
weave launch-proxy $tlsargs &&
eval "$(weave env)" &&

# test/confirm weave status
weave status &&
docker logs weaveproxy

# pull and build images and containers
# this step will take several minutes to pull images first time
docker-compose -f docker-compose.yml -p music up -d

# wait for container apps to fully start
sleep 15

# test weave (should list entries for all containers)
docker exec -it music_proxy_1 cat /etc/hosts 

# run quick test of Spring Music application
for i in {1..10}
  curl -I --url $(docker-machine ip springmusic)

One last test, to ensure that MongoDB is using the host’s volume, and not storing data in the MongoDB container’s /data/db directory, execute the following command: docker-machine ssh springmusic ls -Alh /opt/mongodb. You should see MongoDB-related content being stored here.

Testing Weave

Running the weave status command, we should observe that Weave returned a status similar to the example below:

gstafford@gstafford-X555LA:$ weave status

       Version: v1.1.0

       Service: router
      Protocol: weave 1..2
          Name: 6a:69:11:1b:b4:e3(springmusic)
    Encryption: disabled
 PeerDiscovery: enabled
       Targets: 0
   Connections: 0
         Peers: 1

       Service: ipam
     Consensus: achieved
         Range: [

       Service: dns
        Domain: weave.local.
           TTL: 1
       Entries: 2

       Service: proxy
       Address: tcp://

Running the docker exec -it music_proxy_1 cat /etc/hosts command, we should observe that WeaveDNS has automatically added entries for all containers to the music_proxy_1 container’s /etc/hosts file. WeaveDNS will also remove the addresses of any containers that die. This offers a simple way to implement redundancy.

gstafford@gstafford-X555LA:$ docker exec -it music_proxy_1 cat /etc/hosts

# modified by weave       music_proxy_1       localhost    weave weave.bridge    music_elk_1 music_elk_1.bridge    music_nosqldb_1 music_nosqldb_1.bridge    music_app02_1 music_app02_1.bridge    music_logspout_1 music_logspout_1.bridge    music_app01_1 music_app01_1.bridge

::1             ip6-localhost ip6-loopback localhost
fe00::0         ip6-localnet
ff00::0         ip6-mcastprefix
ff02::1         ip6-allnodes
ff02::2         ip6-allrouters

Weave resolves the container’s name to eth0 IP address, created by Docker’s docker0 Ethernet bridge. Each container can now communicate with all other containers in the cluster.

Weave eth0 Network


Resulting virtual machines, network, images, and containers:

gstafford@gstafford-X555LA:$ docker-machine ls
NAME            ACTIVE   DRIVER       STATE     URL                         SWARM
springmusic     *        virtualbox   Running   tcp://   

gstafford@gstafford-X555LA:$ docker images
REPOSITORY             TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
music_app02            latest              632c782010ac        3 days ago          370.4 MB
music_app01            latest              632c782010ac        3 days ago          370.4 MB
music_proxy            latest              171624a31920        3 days ago          144.5 MB
music_nosqldb          latest              2b3b46af5ef3        3 days ago          260.8 MB
music_elk              latest              5c18dae84b26        3 days ago          1.05 GB
weaveworks/weaveexec   v1.1.0              69c6bfa7934f        5 days ago          58.18 MB
weaveworks/weave       v1.1.0              5dccf0533147        5 days ago          17.53 MB
music_logspout         latest              fe64597ab0c4        8 days ago          24.36 MB
gliderlabs/logspout    master              40a52d6ca462        9 days ago          14.75 MB
willdurand/elk         latest              04cd7334eb5d        2 weeks ago         1.05 GB
tomcat                 latest              6fe1972e6b08        2 weeks ago         347.7 MB
mongo                  latest              5c9464760d54        2 weeks ago         260.8 MB
nginx                  latest              cd3cf76a61ee        2 weeks ago         132.9 MB

gstafford@gstafford-X555LA:$ weave ps
weave:expose 6a:69:11:1b:b4:e3
2bce66e3b33b fa:07:7e:85:37:1b
604dbbc4473f 6a:73:8d:54:cc:fe
ea64b42cf5a1 c2:69:73:84:67:69
85b1e8a9b8d0 aa:f7:12:cd:b7:13
81041fc97d1f 2e:1e:82:67:89:5d
e80c04bdbfaf 1e:95:a5:b2:9d:30
18c22e7f1c33 7e:43:54:db:8d:b8

gstafford@gstafford-X555LA:$ docker ps -a
CONTAINER ID        IMAGE                         COMMAND                  CREATED             STATUS              PORTS                                                                                            NAMES
2bce66e3b33b        music_app01                   "/w/w run"   3 days ago          Up 3 days >8080/tcp                                                                           music_app01_1
604dbbc4473f        music_logspout                "/w/w /bin/logspout"     3 days ago          Up 3 days           8000/tcp,>80/tcp                                                                   music_logspout_1
ea64b42cf5a1        music_app02                   "/w/w run"   3 days ago          Up 3 days >8080/tcp                                                                           music_app02_1
85b1e8a9b8d0        music_proxy                   "/w/w nginx -g 'daemo"   3 days ago          Up 3 days >80/tcp, 443/tcp                                                                      music_proxy_1
81041fc97d1f        music_nosqldb                 "/w/w / "   3 days ago          Up 3 days           27017/tcp                                                                                        music_nosqldb_1
e80c04bdbfaf        music_elk                     "/w/w /usr/bin/superv"   3 days ago          Up 3 days           5000/0,>80/tcp,>9200/tcp                                             music_elk_1
8eafc6225fc1        weaveworks/weaveexec:v1.1.0   "/home/weave/weavepro"   3 days ago          Up 3 days                                                                                                            weaveproxy
18c22e7f1c33        weaveworks/weave:v1.1.0       "/home/weave/weaver -"   3 days ago          Up 3 days >53/udp,>6783/tcp,>6783/udp,>53/tcp   weave

Spring Music Application Links

Assuming springmusic VM is running at, these are the accessible URL for each of the environment’s major components:

* The Tomcat user name is admin and the password is t0mcat53rv3r.

Helpful Links

, , , , , , , , ,

Leave a comment