Introduction
Modern DevOps tools, such as HashiCorp's Packer and Terraform, make it easier to provision and manage complex cloud architecture. Using a CI/CD server, such as Jenkins, to securely automate the use of these DevOps tools ensures quick and consistent results.
In a recent post, Distributed Service Configuration with Consul, Spring Cloud, and Docker, we built a Consul cluster using Docker swarm mode, to host distributed configurations for a Spring Boot application. The cluster was built locally with VirtualBox. This architecture is fine for development and testing, but not for use in Production.
In this post, we will deploy a highly available three-node Consul cluster to AWS. We will use Terraform to provision a set of EC2 instances and accompanying infrastructure. The instances will be built from a hybrid AMI containing the new Docker Community Edition (CE). In a recent post, Baking AWS AMI with new Docker CE Using Packer, we provisioned an Ubuntu AMI with Docker CE, using Packer. We will deploy a Docker container to each EC2 host, each container running a Consul server instance.
All source code can be found on GitHub.
Jenkins
I have chosen Jenkins to automate all of the post's build, provisioning, and deployment tasks. However, none of the code is written specifically for Jenkins; you may run all of it from the command line.
For this post, I have built four projects in Jenkins, as follows:
- Provision Docker CE AMI: Builds Ubuntu AMI with Docker CE, using Packer
- Provision Consul Infra AWS: Provisions Consul infrastructure on AWS, using Terraform
- Deploy Consul Cluster AWS: Deploys Consul to AWS, using Docker
- Destroy Consul Infra AWS: Destroys Consul infrastructure on AWS, using Terraform
We will primarily be using the ‘Provision Consul Infra AWS’, ‘Deploy Consul Cluster AWS’, and ‘Destroy Consul Infra AWS’ Jenkins projects in this post. The fourth Jenkins project, ‘Provision Docker CE AMI’, automates the steps found in the recent post, Baking AWS AMI with new Docker CE Using Packer, to build the AMI used to provision the EC2 instances in this post.
Terraform
Using Terraform, we will provision EC2 instances in three different Availability Zones within the US East 1 (N. Virginia) Region. Using Terraform’s Amazon Web Services (AWS) provider, we will create the following AWS resources:
- (1) Virtual Private Cloud (VPC)
- (1) Internet Gateway
- (1) Key Pair
- (3) Elastic Cloud Compute (EC2) Instances
- (2) Security Groups
- (3) Subnets
- (1) Route
- (3) Route Tables
- (3) Route Table Associations
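Once the apply step later in this post completes, you can cross-check this inventory from the command line; terraform state list prints one line for each resource under Terraform's management:

# enumerate every resource in the current state; the output should
# correspond to the resources itemized above
terraform state list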
The final AWS architecture should resemble the following:
Production Ready AWS
Although we have provisioned a fairly complete VPC for this post, it is far from being ready for Production. I have created two security groups, limiting the ingress and egress to the cluster. However, to further productionize the environment would require additional security hardening. At a minimum, you should consider adding public/private subnets, NAT gateways, network access control list rules (network ACLs), and the use of HTTPS for secure communications.
In production, applications would communicate with Consul through local Consul clients. Consul clients would take part in the LAN gossip pool from different subnets, Availability Zones, Regions, or VPCs using VPC peering. Communications would be tightly controlled by IAM, VPC, subnet, IP address, and port.
Also, you would not have direct access to the Consul UI through a publicly exposed IP or DNS address. Access to the UI would be removed altogether or locked down to specific IP addresses, with access restricted to secure communication channels.
Consul
We will achieve high availability (HA) by clustering three Consul server nodes across the three Elastic Cloud Compute (EC2) instances. In this minimally sized, three-node cluster of Consul servers, we are protected from the loss of one Consul server node, one EC2 instance, or one Availability Zone (AZ); the cluster will still maintain a quorum of two nodes. An additional level of HA that Consul supports, multiple datacenters (multiple AWS Regions), is not demonstrated in this post.
Docker
Having Docker CE already installed on each EC2 instance allows us to execute remote Docker commands over SSH from Jenkins. These commands will deploy and configure a Consul server node, within a Docker container, on each EC2 instance. The containers are built from HashiCorp’s latest Consul Docker image pulled from Docker Hub.
Getting Started
Preliminary Steps
If you have built infrastructure on AWS with Terraform, these steps should be familiar to you:
- First, you will need an AMI with Docker. I suggest reading Baking AWS AMI with new Docker CE Using Packer.
- You will need an AWS IAM User with the proper access to create the required infrastructure. For this post, I created a separate Jenkins IAM User with PowerUser level access.
- You will need to have an RSA public-private key pair, which can be used to SSH into the EC2 instances and install Consul.
- Ensure you have your AWS credentials set. I usually source mine from a .env file, as environment variables (see the sketch after this list). Jenkins can securely manage credentials, using secret text or files.
- Fork and/or clone the Consul cluster project from GitHub.
- Change the aws_key_name and public_key_path variable values to your own RSA key, in the variables.tf file.
- Change the aws_amis_base variable value to your own AMI ID (see step 1).
- If you do not want to use the US East 1 Region and its AZs, modify the variables.tf, network.tf, and instances.tf files.
- Disable Terraform's remote state or modify the resource to match your remote state configuration, in the main.tf file. I am using an Amazon S3 bucket to store my Terraform remote state.
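As referenced in the credentials and key pair steps above, here is a minimal sketch of sourcing a .env file and generating the RSA key pair. The .env variable names are illustrative, though the ~/.ssh/consul_aws_rsa key path matches the deployment scripts later in this post:

# load AWS credentials from a local .env file as environment variables
# (expects AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_DEFAULT_REGION)
set -a; source .env; set +a

# generate the RSA key pair referenced by the aws_key_name and
# public_key_path variables in variables.tf
ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/consul_aws_rsa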
Building an AMI with Docker
If you have not already built an Amazon Machine Image (AMI) for this post, you can do so using the scripts provided in the previous post's GitHub repository. To automate the AMI build task, I built the 'Provision Docker CE AMI' Jenkins project. Identical to the other three Jenkins projects in this post, this project has three main tasks: 1) SCM: clone the Packer AMI GitHub project, 2) Bindings: set up the AWS credentials, and 3) Build: run Packer.
The SCM and Bindings tasks are identical to the other projects (see below for details), except for the use of a different GitHub repository. The project's Build step, which runs the packer_build_ami.sh script, looks as follows:
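The script's contents are not reproduced here; a minimal sketch of running Packer non-interactively might look like the following, where the template filename is purely illustrative:

# validate and build the AMI from the Packer template (filename is hypothetical);
# the amazon-ebs builder reads AWS credentials from the environment
packer validate docker_ami.json
packer build docker_ami.json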
The resulting AMI ID will need to be manually placed in Terraform's variables.tf file, before provisioning the AWS infrastructure with Terraform. The new AMI ID will be displayed in the Jenkins build output.
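Rather than copying the ID from the build output, you can also query it with the AWS CLI; the name filter below is an assumption about how the AMI was named:

# find the most recently created, self-owned Docker CE AMI
# (the name pattern is illustrative)
aws ec2 describe-images \
  --owners self \
  --filters Name=name,Values='docker-ce*' \
  --output text \
  --query 'sort_by(Images, &CreationDate)[-1].ImageId'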
Provisioning with Terraform
Based on the modifications you made in the Preliminary Steps, execute the terraform validate command to confirm your changes. Then, run the terraform plan command to review the plan. Assuming there are no errors, run the terraform apply command to provision the AWS infrastructure components.
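Run from the command line, and assuming the Terraform configuration lives in the tf_env_aws/ directory used by the Jenkins script below, the sequence is:

cd tf_env_aws/
# check the configuration for syntax errors
terraform validate
# preview the resources that will be created
terraform plan
# provision the infrastructure
terraform apply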
In Jenkins, I have created the ‘Provision Consul Infra AWS’ project. This project has three tasks, which include: 1) SCM: clone the GitHub project, 2) Bindings: set up the AWS credentials, and 3) Build: run Terraform. Those tasks look as follows:
You will obviously need to use your modified GitHub project, incorporating the configuration changes detailed above, as the SCM source for Jenkins.
You will also need to configure your AWS credentials.
The provision_infra.sh script provisions the AWS infrastructure using Terraform. The script also updates Terraform's remote state. Remember to update the remote state configuration in the script to match your personal settings.
cd tf_env_aws/
terraform remote config \
  -backend=s3 \
  -backend-config="bucket=your_bucket" \
  -backend-config="key=terraform_consul.tfstate" \
  -backend-config="region=your_region"
terraform plan
terraform apply
The Jenkins build output should look similar to the following:
Although the build only takes about 90 seconds to complete, the EC2 instances could take a few extra minutes to complete their Status Checks and be completely ready. The final results in the AWS EC2 Management Console should look as follows:
Note each EC2 instance is running in a different US East 1 Availability Zone.
Installing Consul
Once the AWS infrastructure is running and the EC2 instances have completed their Status Checks successfully, we are ready to deploy Consul. In Jenkins, I have created the ‘Deploy Consul Cluster AWS’ project. This project has three tasks, which include: 1) SCM: clone the GitHub project, 2) Bindings: set up the AWS credentials, and 3) Build: run an SSH remote Docker command on each EC2 instance to deploy Consul. The SCM and Bindings tasks are identical to the project above. The project’s Build step looks as follows:
First, the delete_containers.sh script deletes any previous instances of the Consul containers; this is helpful if you need to re-deploy Consul (a minimal sketch of the script appears below). Next, the deploy_consul.sh script executes a series of SSH remote Docker commands to install and configure Consul on each EC2 instance.
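The delete_containers.sh script itself is not shown in this post; a minimal sketch, assuming it loops over the three instance tags used throughout this post, might look like the following:

# a minimal sketch of delete_containers.sh (looping logic is assumed)
for i in 1 2 3; do
  ec2_public_ip=$(aws ec2 describe-instances \
    --filters Name="tag:Name,Values=tf-instance-consul-server-${i}" \
    --output text --query 'Reservations[*].Instances[*].PublicIpAddress')
  # force-remove the container if it exists; ignore errors if it does not
  ssh -oStrictHostKeyChecking=no -T -i ~/.ssh/consul_aws_rsa \
    ubuntu@${ec2_public_ip} \
    "docker rm -f consul-server-${i} 2> /dev/null || true"
done

The deploy_consul.sh script follows, broken into its major sections. First, the private IP address of the first Consul server is captured; the other two servers will use it to join the cluster: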
# Advertised Consul IP
export ec2_server1_private_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-1' \
  --output text --query 'Reservations[*].Instances[*].PrivateIpAddress')
echo "consul-server-1 private ip: ${ec2_server1_private_ip}"
# Deploy Consul Server 1
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-1' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')
consul_server="consul-server-1"
ssh -oStrictHostKeyChecking=no -T \
  -i ~/.ssh/consul_aws_rsa \
  ubuntu@${ec2_public_ip} << EOSSH
docker run -d \
  --net=host \
  --hostname ${consul_server} \
  --name ${consul_server} \
  --env "SERVICE_IGNORE=true" \
  --env "CONSUL_CLIENT_INTERFACE=eth0" \
  --env "CONSUL_BIND_INTERFACE=eth0" \
  --volume /home/ubuntu/consul/data:/consul/data \
  --publish 8500:8500 \
  consul:latest \
  consul agent -server -ui -client=0.0.0.0 \
  -bootstrap-expect=3 \
  -advertise='{{ GetInterfaceIP "eth0" }}' \
  -data-dir="/consul/data"
sleep 5
docker logs consul-server-1
docker exec -i consul-server-1 consul members
EOSSH
# Deploy Consul Server 2
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-2' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')
consul_server="consul-server-2"
ssh -oStrictHostKeyChecking=no -T \
  -i ~/.ssh/consul_aws_rsa \
  ubuntu@${ec2_public_ip} << EOSSH
docker run -d \
  --net=host \
  --hostname ${consul_server} \
  --name ${consul_server} \
  --env "SERVICE_IGNORE=true" \
  --env "CONSUL_CLIENT_INTERFACE=eth0" \
  --env "CONSUL_BIND_INTERFACE=eth0" \
  --volume /home/ubuntu/consul/data:/consul/data \
  --publish 8500:8500 \
  consul:latest \
  consul agent -server -ui -client=0.0.0.0 \
  -advertise='{{ GetInterfaceIP "eth0" }}' \
  -retry-join="${ec2_server1_private_ip}" \
  -data-dir="/consul/data"
sleep 5
docker logs consul-server-2
docker exec -i consul-server-2 consul members
EOSSH
# Deploy Consul Server 3
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-3' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')
consul_server="consul-server-3"
ssh -oStrictHostKeyChecking=no -T \
  -i ~/.ssh/consul_aws_rsa \
  ubuntu@${ec2_public_ip} << EOSSH
docker run -d \
  --net=host \
  --hostname ${consul_server} \
  --name ${consul_server} \
  --env "SERVICE_IGNORE=true" \
  --env "CONSUL_CLIENT_INTERFACE=eth0" \
  --env "CONSUL_BIND_INTERFACE=eth0" \
  --volume /home/ubuntu/consul/data:/consul/data \
  --publish 8500:8500 \
  consul:latest \
  consul agent -server -ui -client=0.0.0.0 \
  -advertise='{{ GetInterfaceIP "eth0" }}' \
  -retry-join="${ec2_server1_private_ip}" \
  -data-dir="/consul/data"
sleep 5
docker logs consul-server-3
docker exec -i consul-server-3 consul members
EOSSH
# Output Consul Web UI URL
ec2_public_ip=$(aws ec2 describe-instances \
  --filters Name='tag:Name,Values=tf-instance-consul-server-1' \
  --output text --query 'Reservations[*].Instances[*].PublicIpAddress')
echo " "
echo "*** Consul UI: http://${ec2_public_ip}:8500/ui/ ***"
The entire Jenkins build process only takes about 30 seconds. Afterward, the output from a successful Jenkins build should show that all three Consul server instances are running, have formed a quorum, and have elected a Leader.
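Beyond the build output, you can verify cluster health directly on any node, using standard Consul CLI commands available inside the official image:

# list cluster members; all three servers should show status 'alive'
docker exec -i consul-server-1 consul members

# confirm this node's view of leadership (leader = true on the elected node)
docker exec -i consul-server-1 consul info | grep leader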
Persisting State
The Consul Docker image exposes VOLUME /consul/data, which is the path where Consul will place its persisted state. Using Terraform's remote-exec provisioner, we create a directory on each EC2 instance, at /home/ubuntu/consul/data. The docker run command bind-mounts the container's /consul/data path to the EC2 host's /home/ubuntu/consul/data directory.
According to Consul's documentation, the Consul server container instance will 'store the client information plus snapshots and data related to the consensus algorithm and other state, like Consul's key/value store and catalog' in the /consul/data directory. That container directory is now bind-mounted to the EC2 host, as demonstrated below.
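To confirm state is actually landing on the host, you can list the bind-mounted directory over SSH, reusing the ec2_public_ip variable captured in the deployment script above:

# inspect Consul's persisted Raft and server state on the EC2 host
ssh -i ~/.ssh/consul_aws_rsa ubuntu@${ec2_public_ip} \
  'ls -alR /home/ubuntu/consul/data'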
Accessing Consul
Following a successful deployment, you should be able to use the public URL, displayed in the build output of the ‘Deploy Consul Cluster AWS’ project, to access the Consul UI. Clicking on the Nodes tab in the UI, you should see all three Consul server instances, one per EC2 instance, running and healthy.
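If you prefer the command line, the same node information is available from Consul's HTTP API; for example, hitting the standard /v1/catalog/nodes endpoint on the public address from the build output:

# list all nodes in the catalog; should return the three Consul servers
curl -s http://${ec2_public_ip}:8500/v1/catalog/nodes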
Destroying Infrastructure
When you are finished with the post, you may want to remove the running infrastructure, so you don't continue to get billed by Amazon. The 'Destroy Consul Infra AWS' project destroys all the AWS infrastructure, provisioned as part of this post, in about 60 seconds. The project's SCM and Bindings tasks are identical to both previous projects. The Build step calls the destroy_infra.sh script, which is included in the GitHub project. The script executes the terraform destroy -force command, deleting all running infrastructure components associated with the post and updating Terraform's remote state.
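The destroy_infra.sh script is not reproduced in this post; a minimal sketch, assuming it mirrors the remote state configuration of provision_infra.sh:

cd tf_env_aws/
terraform remote config \
  -backend=s3 \
  -backend-config="bucket=your_bucket" \
  -backend-config="key=terraform_consul.tfstate" \
  -backend-config="region=your_region"
# -force skips the interactive confirmation prompt
terraform destroy -force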
Conclusion
This post has demonstrated how modern DevOps tooling, such as HashiCorp's Packer and Terraform, makes it easy to build, provision, and manage complex cloud architecture. Using a CI/CD server, such as Jenkins, to securely automate the use of these tools ensures quick and consistent results.
All opinions in this post are my own and not necessarily the views of my current employer or their clients.