Posts Tagged Microservices

The Evolving Role of DevOps in Emerging Technologies: Part 1/2


Growth of DevOps

The adoption of DevOps practices by global organizations has become mainstream, according to many recent industry studies. For instance, a late 2016 study of global enterprise organizations, conducted by IDG Research for Unisys Corporation, found 38 percent of respondents had already adopted DevOps, while another 29 percent were in the planning phase, and 17 percent were in the evaluation stage. Adoption rates were even higher, 49 percent versus 38 percent, for larger organizations with 500 or more developers.

Another recent 2017 study by Red Gate Software, The State of Database DevOps, based on 1,000 global organizations, found 47 percent of the respondents had already adopted DevOps practices, with another 33 percent planning on adopting DevOps practices within the next 24 months. Similar to the Unisys study, prior adoption rates were considerably higher, 59 percent versus 47 percent, for larger organizations with over 1,000 employees.

Emerging Technologies

Although DevOps originated to meet the needs of Agile software development to release more frequently, DevOps is no longer just continuous integration and continuous delivery. As more organizations undergo a digital transformation and adopt disruptive technologies to drive business success, the role of DevOps continues to evolve and expand.

Emerging technology trends, such as Machine Learning, Artificial Intelligence (AI), and the Internet of Things (IoT/IIoT), both influence DevOps practices and create the need to apply DevOps practices to those emerging technologies themselves. Let’s examine the impact of some of these emerging technology trends on DevOps in this brief, two-part post.

Mobile

Although mobile application development is certainly not new, DevOps practices around mobile continue to evolve as mobile becomes the primary application platform for many organizations. Mobile applications have unique development and operational requirements. Take, for example, UI functional testing. Whereas web application developers often test against a relatively small matrix of popular web browsers and operating systems (Desktop Browser Market Share – Net Applications.com), mobile developers must test against a continuous outpouring of new mobile devices, both tablets and phones (Test on the right mobile devices – BrowserStack). The complexity of automating tests across such a large number of mobile devices has resulted in the growth of specialized cloud-based testing platforms, such as BrowserStack and Sauce Labs.

Cloud

Similar to Mobile, the Cloud is certainly not new. However, as more firms move their IT operations to the Cloud, DevOps practices have had to adapt rapidly. Nowhere is the need to adapt more apparent than with Amazon Web Services. Currently, AWS lists no fewer than 18 categories of cloud offerings on its website, with each category containing several products and services. Categories include compute, storage, databases, networking, security, messaging, mobile, AI, IoT, and analytics.

In addition to products like compute, storage, and database, AWS now offers development, DevOps, and management tools, such as AWS OpsWorks and AWS CloudFormation. These products offer alternatives to traditional non-cloud CI/CD/RM workflows for deploying and managing complex application platforms on AWS. Learning the nuances of a growing list of AWS specific products and workflows, while simultaneously adapting your organization’s DevOps practices to them, has resulted in a whole new category of DevOps engineering specialization centered around AWS. Cloud-centric DevOps engineering specialization is also seen with other large cloud providers, such as Microsoft Azure and Google Cloud Platform.

Security

Call it DevSecOps, SecDevOps, SecOps, or Rugged DevOps, the intersection of DevOps and Security is bustling these days. As the complexity of modern application platforms grows, as well as the sophistication of threats from hackers and the requirements of government and industry compliance, security is no longer an afterthought or a process run in seeming isolation from software development and DevOps. In my recent experience, it is not uncommon to see IT security specialists actively participating on Agile development teams and embedded on DevOps and Platform teams.

Modern application platforms must be designed from day one to be bug-free, performant, compliant, and secure.

Security practices are now commonly part of the entire software development lifecycle, including enterprise architecture, software development, data governance, continuous testing, and infrastructure as code. Modern application platforms must be designed from day one to be bug-free, performant, compliant, and secure.

Take, for example, penetration (PEN) testing. Once a mostly manual process, done close to release time, evolving DevOps practices now allow testing for security vulnerabilities in applications and software-defined infrastructure to be done early and often in the software development lifecycle. Easily automatable and configurable cloud- and non-cloud-based tools like SonarQube, Veracode, Qualys, OWASP ZAP, and Chef Compliance, amongst others, are frequently incorporated into continuous integration workflows by development and DevOps teams. There is no longer an excuse for security vulnerabilities to be discovered just before release, or worse, in Production.

Modern Platforms

Along with the Cloud, modern application development trends, like the rise of the platform, microservices (or service-based architectures), containerization, NoSQL databases, and container orchestration, have likely provided the majority of fuel for the recent explosive growth of DevOps. Although innovative IT organizations have fostered these technologies for the past few years, their growth and relative maturity have risen sharply in the last 12 to 18 months.

No longer the stuff of Unicorns, platforms based on Evolutionary Architectures are being built and deployed by an increasing number of everyday organizations.

No longer the stuff of Unicorns, such as Amazon, Etsy, and Netflix, platforms based on Evolutionary Architectures are being built and deployed by an increasing number of everyday organizations. Although complexity continues to rise, the barrier to entry has been greatly reduced with technologies found across the SDLC, including  Node, Spring Boot, Docker, Consul, Terraform, and Kubernetes, amongst others.

As modern platforms become more commonplace, the DevOps practices around them continue to mature and become specialized. With potentially hundreds of moving parts, building, testing, deploying, and actively managing a large-scale, microservices-based application on a container orchestration platform requires highly specialized knowledge. The ability to ‘do DevOps at scale’ is critical.

Legacy Systems

Legacy systems as an emerging technology trend in DevOps? As the race to build the ‘next generation’ of application platforms accelerates to meet the demands of the business and its customers, there is a growing need to support ‘last generation’ systems. Many IT organizations support multiple legacy systems, ranging in age from as little as five years old to more than 25 years old. These monolithic legacy systems, which often contain a company’s secret sauce, such as complex business algorithms and decision engines, are built on outmoded technology stacks, often lack vendor support, and require separate processes to build, test, deploy, and manage. Worse, the knowledge required to maintain these systems frequently resides with a shrinking group of IT resources. Who wants to work on the old system with so many bright and shiny toys being built?

As a cost-effective means of maintaining these legacy systems, organizations are turning to modern DevOps practices. Although not possible to the same degree, depending on the legacy technology, these practices include the use of source control, various types of automated testing, automated provisioning, deployment, and configuration of system components, and infrastructure automation (DevOps for legacy systems – Infosys white paper).

Although not specifically a DevOps practice, organizations are also implementing content collaboration systems, like Atlassian Confluence and Microsoft SharePoint, to document legacy system architectures and manual processes before those resources and their knowledge are lost.

Part Two

In the second half of this post, we will continue to look at emerging technologies and their impact on DevOps, including:

  • Big Data
  • Internet of Things (IoT/IIoT)
  • Artificial Intelligence (AI)
  • Machine Learning
  • COTS/SaaS

All opinions in this post are my own and not necessarily the views of my current employer or their clients.

Illustration Copyright: Andreus / 123RF Stock Photo



Eventual Consistency: Decoupling Microservices with Spring AMQP and RabbitMQ

RabbitMQEnventCons.png

Introduction

In a recent post, Decoupling Microservices using Message-based RPC IPC, with Spring, RabbitMQ, and AMQP, we moved away from synchronous REST HTTP for inter-process communications (IPC) toward message-based IPC. Moving to asynchronous message-based communications allows us to decouple services from one another. It makes it easier to build, test, and release our individual services. In that post, we did not achieve fully asynchronous communications. However, we did achieve a higher level of service decoupling using message-based Remote Procedure Call (RPC) IPC.

In this post, we will fully decouple our services using the distributed computing model of eventual consistency. More specifically, we will use a message-based, event-driven, loosely-coupled, eventually consistent architectural approach for communications between services.

What is eventual consistency? One of the best definitions of eventual consistency I have read was posted on microservices.io. To paraphrase, ‘using an event-driven, eventually consistent approach, each service publishes an event whenever it updates its data. Other services subscribe to events. When an event is received, a service updates its data.’

Example of Eventual Consistency

Imagine Service A, the Customer service, inserts a new customer record into its database. Based on that ‘customer created’ event, Service A publishes a message containing the new customer object, serialized to JSON, to a lightweight persistent message queue.

Service B, the new customer Onboarding service, a subscriber to that queue, consumes Service A’s message. Service B then executes that same CRUD operation, inserting the same new customer record into its database.

In the above example, it can be said that the customer records in Service B’s database are eventually consistent with the customer records in Service A’s database. Service A makes a change and publishes a message in response to the event. Service B consumes the message and makes the same change. Eventually (likely milliseconds) Service B’s customer records are consistent with Service A’s customer records.

Why Eventual Consistency?

So what does this apparent added complexity and duplication of data buy us? Consider the advantages. Service B, the Onboarding service, requires no knowledge of, or a dependency on, Service A, the Customer service. Still, Service B has a current record of all the customers that Service A maintains. Instead of making repeated, and potentially costly, REST HTTP or message-based RPC calls to Service A for customer data, Service B simply queries its own database for a list of customers.

The value of eventual consistency increases factorially as you scale a distributed system. Imagine dozens of distinct microservices, many requiring data from other microservices. Further, imagine multiple instances of each of those services all running in parallel. Decoupling services from one another, through asynchronous forms of IPC, messaging, and event-driven eventual consistency greatly simplifies the software development lifecycle and operations.

Demonstration

In this post, we could use a few different architectural patterns to demonstrate message passing with RabbitMQ and Spring AMQP, including Work Queues, Publish/Subscribe, Routing, or Topics. To keep things as simple as possible, we will have a single Producer publish messages to a single durable and persistent message queue, and a single Subscriber, a Consumer, consume the messages from that queue. We will focus on a single type of event message.

Sample Code

To demonstrate Spring AMQP-based messaging with RabbitMQ, we will use a reference set of three Spring Boot microservices. The Election Service, Candidate Service, and Voter Service are all backed by MongoDB. The services and MongoDB, along with RabbitMQ and the Voter API Gateway, are all part of the Voter API.

The Voter API Gateway, based on HAProxy, serves as a common entry point to all three services, as well as serving as a reverse proxy and load balancer. The API Gateway provides round-robin load-balanced access to multiple instances of each service.

Voter_API_Architecture

All the source code for this post’s example is available on GitHub, within a few different project repositories. The Voter Service repository contains the Voter service source code, along with the scripts and Docker Compose files required to deploy the project. The Election Service repository, Candidate Service repository, and Voter API Gateway repository are also available on GitHub. There is also a new AngularJS/Node.js Web Client to demonstrate how to use the Voter API.

For this post, you only need to clone the Voter Service repository.

Deploying Voter API

All components, including the Spring Boot services, MongoDB, RabbitMQ, API Gateway, and the Web Client, are individually deployed using Docker. Each component is publicly available as a Docker Image, on Docker Hub. The Voter Service repository contains scripts to deploy the entire set of Dockerized components, locally. The repository also contains optional scripts to provision a Docker Swarm, using Docker’s newer swarm mode, and deploy the components. We will only deploy the services locally for this post.

To clone and deploy the components locally, including the Spring Boot services, MongoDB, RabbitMQ, and the API Gateway, execute the following commands. If this is your first time running the commands, it may take a few minutes for your system to download all the required Docker Images from Docker Hub.

If everything was deployed successfully, you should observe six running Docker containers, similar to the output, below.

Using Voter API

The Voter Service, Election Service, and Candidate Service GitHub repositories each contain README files, which detail all the API endpoints each service exposes, and how to call them.

In addition to casting votes for candidates, the Voter service can simulate election results. By calling the /simulation endpoint and indicating the desired election, the Voter service will randomly generate a number of votes for each candidate in that election. This will save us the burden of casting votes for this demonstration. However, the Voter service has no knowledge of elections or candidates. The Voter service depends on the Candidate service to obtain a list of candidates.

The Candidate service manages electoral candidates, their political affiliation, and the election in which they are running. Like the Voter service, the Candidate service also has a /simulation endpoint. The service will create a list of candidates based on the 2012 and 2016 US Presidential Elections. The simulation capability of the service saves us the burden of inputting candidates for this demonstration.

The Election service manages elections, their polling dates, and the type of election (federal, state, or local). Like the other services, the Election service also has a /simulation endpoint, which will create a list of sample elections. The Election service will not be discussed in this post’s demonstration. We will examine communications between the Candidate and Voter services, only.

REST HTTP Endpoint

As you recall from our previous post, Decoupling Microservices using Message-based RPC IPC, with Spring, RabbitMQ, and AMQP, the Voter service exposes multiple, almost identical endpoints. Each endpoint uses a different means of IPC to retrieve candidates and generates random votes.

Calling the /voter/simulation/election/{election} endpoint and providing a specific election, prompts the Voter service to request a list of candidates from the Candidate service, based on the election parameter you input. This request is done using synchronous REST HTTP. The Voter service uses the HTTP GET method to request the data from the Candidate service. The Voter service then waits for a response.

The Candidate service receives the HTTP request. The Candidate service responds to the Voter service with a list of candidates in JSON format. The Voter service receives the response payload containing the list of candidates. The Voter service then proceeds to generate a random number of votes for each candidate in the list. Finally, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters  database.

Message-based RPC Endpoint

Similarly, calling the /voter/simulation/rpc/election/{election} endpoint and providing a specific election prompts the Voter service to request the same list of candidates. However, this time, the Voter service (the client) produces a request message and places it in RabbitMQ’s voter.rpc.requests queue. The Voter service then waits for a response. The Voter service has no direct dependency on the Candidate service; it only depends on a response to its request message. In this way, it is still a synchronous form of IPC, but the Voter service is now decoupled from the Candidate service.

The request message is consumed by the Candidate service (the server), who is listening to that queue. In response, the Candidate service produces a message containing the list of candidates serialized to JSON. The Candidate service (the server) sends a response back to the Voter service (the client) through RabbitMQ. This is done using the Direct reply-to feature of RabbitMQ or using a unique response queue, specified in the reply-to header of the request message, sent by the Voter Service.

The Voter service receives the message containing the list of candidates. The Voter service deserializes the JSON payload to candidate objects. The Voter service then proceeds to generate a random number of votes for each candidate in the list. Finally, identical to the previous example, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters database.

New Endpoint

Calling the new /voter/simulation/db/election/{election} endpoint and providing a specific election, prompts the Voter service to query its own MongoDB database for a list of candidates.

But wait, where did the candidates come from? The Voter service didn’t call the Candidate service? The answer is message-based eventual consistency. Whenever a new candidate is created, using a REST HTTP POST request to the Candidate service’s /candidate/candidates endpoint, a Spring Data Rest Repository Event Handler responds. Responding to the candidate created event, the event handler publishes a message, containing a serialized JSON representation of the new candidate object, to a durable and persistent RabbitMQ queue.

The Voter service is listening to that queue. The Voter service consumes messages off the queue, deserializes the candidate object, and saves it to its own voters database, to the candidate collection. For this example, we are saving the incoming candidate object as is, with no transformations. The candidate object model for both services is identical.

When the /voter/simulation/db/election/{election} endpoint is called, the Voter service queries its voters database for a list of candidates. The Voter service then proceeds to generate a random number of votes for each candidate in the list. Finally, identical to the previous two examples, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters database.

Message_Queue_Diagram_Final3B

Exploring the Code

We will not review the REST HTTP or RPC IPC code in this post. It was covered in detail, in the previous post. Instead, we will explore the new code required for eventual consistency.

Spring Dependencies

To use AMQP with RabbitMQ, we need to add a project dependency on org.springframework.boot.spring-boot-starter-amqp. Below is a snippet from the Candidate service’s build.gradle file, showing project dependencies. The Voter service’s dependencies are identical.

AMQP Configuration

Next, we need to add a small amount of RabbitMQ AMQP configuration to both services. We accomplish this by using Spring’s @Configuration annotation on our configuration classes. Below is the abridged configuration class for the Voter service.

And here, the abridged configuration class for the Candidate service.
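The full configuration classes are in each service’s GitHub repository. As a rough, hedged sketch only, assuming the durable candidates.queue queue referenced later in this post, a minimal configuration class for either service might look something like the following.

import org.springframework.amqp.core.Queue;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class AmqpConfig {

    // Assumption: the queue name matches the candidates.queue queue used later in this post
    @Bean
    public Queue candidatesQueue() {
        // the second argument marks the queue as durable, so messages survive a broker restart
        return new Queue("candidates.queue", true);
    }
}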

Event Handler

With our dependencies and configuration in place, we will define the CandidateEventHandler class. This class is annotated with the Spring Data Rest @RepositoryEventHandler and Spring’s @Component. The @Component annotation ensures the event handler is registered.

The class contains the handleCandidateSave method, which is annotated with the Spring Data Rest @HandleAfterCreate. The event handler acts on the Candidate object, which is the first parameter in the method signature.

Responding to the candidate created event, the event handler publishes a message, containing a serialized JSON representation of the new candidate object, to the candidates.queue queue. This was the queue we configured earlier.
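The exact handler is in the Candidate Service repository; below is a hedged sketch of what such a class might look like, assuming manual Jackson serialization, an injected RabbitTemplate, and the Candidate domain class described in this post.

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.rest.core.annotation.HandleAfterCreate;
import org.springframework.data.rest.core.annotation.RepositoryEventHandler;
import org.springframework.stereotype.Component;

@Component
@RepositoryEventHandler
public class CandidateEventHandler {

    private final RabbitTemplate rabbitTemplate;
    private final ObjectMapper objectMapper = new ObjectMapper();

    @Autowired
    public CandidateEventHandler(RabbitTemplate rabbitTemplate) {
        this.rabbitTemplate = rabbitTemplate;
    }

    @HandleAfterCreate
    public void handleCandidateSave(Candidate candidate) throws JsonProcessingException {
        // Serialize the new candidate to JSON and publish it to the durable queue;
        // the default exchange routes the message to the queue of the same name
        String json = objectMapper.writeValueAsString(candidate);
        rabbitTemplate.convertAndSend("candidates.queue", json);
    }
}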

Consuming Messages

Next, let’s switch to the Voter service’s CandidateListService class. Below is an abridged version of the class with two new methods. First, the getCandidateMessage method listens to the candidates.queue queue. This was the queue we configured earlier. The method is annotated with the Spring AMQP @RabbitListener annotation.

The getCandidateMessage method retrieves the new candidate object from the message, deserializes the message’s JSON payload, maps it to the candidate object model, and saves it to the Voter service’s database.

The second method, getCandidatesQueueDb, retrieves the candidates from the Voter service’s database. The method makes use of the Spring Data MongoDB Aggregation package to return a list of candidates from MongoDB.
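Here is a hedged sketch of those two methods, assuming Jackson for deserialization, a MongoTemplate for persistence and aggregation, and the Candidate domain class from this post; the field and collection names are illustrative, not necessarily those used in the actual repository.

import java.io.IOException;
import java.util.List;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.stereotype.Service;

@Service
public class CandidateListService {

    private final MongoTemplate mongoTemplate;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public CandidateListService(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // Consume 'candidate created' messages and save each candidate to the local candidate collection
    @RabbitListener(queues = "candidates.queue")
    public void getCandidateMessage(String message) throws IOException {
        Candidate candidate = objectMapper.readValue(message, Candidate.class);
        mongoTemplate.save(candidate, "candidate");
    }

    // Query the Voter service's own database for the candidates in a given election
    public List<Candidate> getCandidatesQueueDb(String election) {
        Aggregation aggregation = Aggregation.newAggregation(
                Aggregation.match(Criteria.where("election").is(election)),
                Aggregation.sort(Sort.Direction.ASC, "lastName"));
        return mongoTemplate.aggregate(aggregation, "candidate", Candidate.class)
                .getMappedResults();
    }
}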

RabbitMQ Management Console

The easiest way to observe what is happening with the messages is using the RabbitMQ Management Console. To access the console, point your web browser to localhost, on port 15672. The default login credentials for the console are guest/guest. As you successfully produce and consume messages with RabbitMQ, you should see activity on the Overview tab.

RabbitMQ_EC_Durable3.png

Recall we said the queue, in this example, was durable. That means messages will survive the RabbitMQ broker stopping and starting. In the below view of the RabbitMQ Management Console, note the six messages persisted in memory. The Candidate service produced the messages in response to six new candidates being created. However, the Voter service was not running, and therefore, could not consume the messages. In addition, the RabbitMQ server was restarted, after receiving the candidate messages. The messages were persisted and still present in the queue after the successful reboot of RabbitMQ.

RabbitMQ_EC_Durable

Once RabbitMQ and the Voter service instance were back online, the Voter service successfully consumed the six waiting messages from the queue.

RabbitMQ_EC_Durable2.png

Service Logs

In addition to using the RabbitMQ Management Console, we may observe communications between the two services by looking at the Voter and Candidate services’ logs. I have grabbed a snippet of both services’ logs and added a few comments to show where different processes are being executed.

First the Candidate service logs. We observe a REST HTTP POST request containing a new candidate. We then observe the creation of the new candidate object in the Candidate service’s database, followed by the event handler publishing a message on the queue. Finally, we observe the response is returned in reply to the initial REST HTTP POST request.

Now the Voter service logs. At the exact same second as the message and the response sent by the Candidate service, the Voter service consumes the message off the queue. The Voter service then deserializes the new candidate object and inserts it into its database.

MongoDB

Using the mongo Shell, we can observe six new 2016 Presidential Election candidates in the Candidate service’s database.

Now, looking at the Voter service’s database, we should find the same six 2016 Presidential Election candidates. Note the Object IDs are the same between the two services’ document sets, as are the rest of the fields (first name, last name, political party, and election). However, the class field is different between the two services’ records.

Production Considerations

The post demonstrated a simple example of message-based, event-driven eventual consistency. In an actual Production environment, there are a few things that must be considered.

  • We only addressed a ‘candidate created’ event. We would also have to code for other types of events, such as a ‘candidate deleted’ event and a ‘candidate updated’ event.
  • If a candidate is added, deleted, then re-added, are the events published and consumed in the right order? What about with multiple instances of the Voter service running? Does this pattern guarantee event ordering?
  • How should the Candidate service react on startup if RabbitMQ is not available?
  • What if RabbitMQ fails after the Candidate services have started?
  • How should the Candidate service react if a new candidate record is added to the database, but a ‘candidate created’ event message cannot be published to RabbitMQ? The two actions are not wrapped in a single transaction.
  • In all of the above scenarios, what response should be returned to the API end user?

Conclusion

In this post, using eventual consistency, we successfully decoupled our two microservices and achieved asynchronous inter-process communications. Adopting a message-based, event-driven, loosely-coupled architecture, wherever possible, in combination with REST HTTP when it makes sense, will improve the overall manageability and scalability of a microservices-based platform.

References

All opinions in this post are my own and not necessarily the views of my current employer or their clients.



Decoupling Microservices using Message-based RPC IPC, with Spring, RabbitMQ, and AMQP

RabbitMQ_Screen_3

Introduction

There has been considerable growth in modern, highly scalable, distributed application platforms built around fine-grained RESTful microservices. Microservices generally use lightweight protocols to communicate with each other, such as HTTP, TCP, UDP, WebSockets, MQTT, and AMQP. Microservices commonly communicate with each other directly using REST-based HTTP, or indirectly, using messaging brokers.

There are several well-known, production-tested messaging queues, such as Apache Kafka, Apache ActiveMQ, Amazon Simple Queue Service (SQS), and Pivotal’s RabbitMQ. According to Pivotal, of these messaging brokers, RabbitMQ is the most widely deployed open source message broker.

RabbitMQ supports multiple messaging protocols. RabbitMQ’s primary protocol, the Advanced Message Queuing Protocol (AMQP), is an open standard wire-level protocol and semantic framework for high-performance enterprise messaging. According to Spring, ‘AMQP has exchanges, routes, and queues. Messages are first published to exchanges. Routes define on which queue(s) to pipe the message. Consumers subscribing to that queue then receive a copy of the message.’

Pivotal’s Spring AMQP project applies core Spring concepts to the development of AMQP-based messaging solutions. The project’s libraries facilitate management of AMQP resources while promoting the use of dependency injection and declarative configuration. The project provides a ‘template’ (RabbitTemplate) as a high-level abstraction for sending and receiving messages.

In this post, we will explore how to start moving Spring Boot Java services away from using synchronous REST HTTP for inter-process communications (IPC), and toward message-based IPC. Moving from synchronous IPC to messaging queues and asynchronous IPC decouples services from one another, allowing us to more easily build, test, and release individual microservices.

Message-Based RPC IPC

Decoupling services using asynchronous IPC is considered optimal by many enterprise software architects when developing modern distributed platforms. However, sometimes it is not easy or possible to get away from synchronous communications. Rightly or wrongly, services are oftentimes architected such that one service needs to retrieve data from another service, or services, in order to process its own requests. It can be said that such a service has a direct dependency on the other services. Many would argue that services, especially RESTful microservices, should not be coupled in this way.

There are several ways to break direct service-to-service dependencies using asynchronous IPC. We might implement request/async response REST HTTP-based IPC. We could also use publish/subscribe or publish/async response messaging queue-based IPC. These are all described by NGINX, in their article, Building Microservices: Inter-Process Communication in a Microservices Architecture; a must-read for anyone working with microservices. We might also implement an architecture which supports eventual consistency, eliminating the need for one service to obtain data from another service.

So what if we cannot implement asynchronous methods to break direct service dependencies, but we want to move toward message-based IPC? One answer is message-based Remote Procedure Call (RPC) IPC. I realize the mention of RPC might send cold shivers down the spine of many a seasoned architect. Traditional RPC has several challenges, many of which have been overcome with more modern architectural patterns.

According to Wikipedia, ‘in distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in another address space (commonly on another computer on a shared network), which is coded as if it were a normal (local) procedure call, without the programmer explicitly coding the details for the remote interaction.’

Although still a form of RPC, and not asynchronous, it is possible to replace REST HTTP IPC with message-based RPC IPC. Using message-based RPC, services have no direct dependencies on other services. A service only depends on a response to a message request it places on a queue. The services are now decoupled from one another. The requestor service (the client) has no direct knowledge of the respondent service (the server).

RPC with RabbitMQ and AMQP

RabbitMQ has an excellent set of six tutorials, which cover the basics of creating messaging applications with RabbitMQ, applying different architectural patterns, in several different programming languages. The sixth and final tutorial covers using RabbitMQ for RPC-based IPC, with the request/reply architectural pattern.

Pivotal recently added Spring AMQP implementations to each RabbitMQ tutorial, based on their Spring AMQP project. If you recall, the Spring AMQP project applies core Spring concepts to the development of AMQP-based messaging solutions.

This post’s RPC IPC example is closely based on the architectural pattern found in the Spring AMQP RabbitMQ tutorial.

Sample Code

To demonstrate Spring AMQP-based RPC IPC messaging with RabbitMQ, we will use a pair of simple Spring Boot microservices. These services, the Voter and Candidate services, have been used in several previous posts, and for training and testing DevOps engineers. Both services are backed by MongoDB. The services and MongoDB, along with RabbitMQ, are all part of the Voter API project. The Voter API project also contains an HAProxy-based API Gateway, which provides indirect, load-balanced access to the two services.

All code necessary to build this post’s example is available on GitHub, within three projects. The Voter Service project repository contains the Voter service source code, along with the scripts and Docker Compose files required to deploy the project. The Candidate Service project repository and the Voter API Gateway project repository are also available on GitHub. For this post, you need only clone the Voter Service project repository.

Deploying Voter API

All components, including the two Spring services, MongoDB, RabbitMQ, and the API Gateway, are individually deployed using Docker. Each component is publicly available as a Docker Image, on Docker Hub.

The Voter Service repository contains scripts to deploy the entire set of Dockerized components, locally. The repository also contains optional scripts to provision a Docker Swarm, using Docker’s newer swarm mode, and deploy the components. We will only deploy the services locally for this post.

To clone and deploy the components locally, including the two Spring services, MongoDB, RabbitMQ, and the API Gateway, execute the following commands. If this is your first time running the commands, it may take a few minutes for your system to download all the required Docker Images from Docker Hub.

If everything was deployed successfully, you should see the following output. You should observe five running Docker containers.

Using Voter API

The Voter Service and Candidate Service GitHub repositories both contain README files, which detail all the API endpoints each service exposes, and how to call them.

In addition to casting votes for candidates, the Voter service has the ability to simulate election results. By calling a /simulation endpoint, and indicating the desired election, the Voter service will randomly generate a number of votes for each candidate in that election. This will save us the burden of casting votes for this demonstration. However, the Voter service has no knowledge of elections or candidates. To obtain a list of candidates, the Voter service depends on the Candidate service.

The Candidate service manages electoral candidates, their political affiliation, and the election in which they are running. Like the Voter service, the Candidate service also has a /simulation endpoint. The service will create a list of candidates based on the 2012 and 2016 US Presidential Elections. The simulation capability of the service saves us the burden of inputting candidates for this demonstration.

REST HTTP Endpoint

The Voter service exposes two almost identical endpoints. Both endpoints generate random votes. However, under the covers, the two endpoints are very different. Calling the /voter/simulation/election/{election} endpoint prompts the Voter service to request a list of candidates from the Candidate service, based on the election parameter you input. This request is done using synchronous REST HTTP. The Voter service uses the HTTP GET method to request the data from the Candidate service. The Voter service then waits for a response.

The HTTP request is received by the Candidate service. The Candidate service responds to the Voter service with a list of candidates, in JSON format. The Voter service receives the response containing the list of candidates. The Voter service then proceeds to generate a random number of votes for each candidate. Finally, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters  database.

Message Queue Diagram 1D

Message-based RPC Endpoint

Similarly, calling the /voter/simulation/rpc/election/{election} endpoint with a specific election prompts the Voter service to request the same list of candidates. However, this time, the Voter service (the client) produces a request message and places it in RabbitMQ’s voter.rpc.requests queue. The Voter service then waits for a response. The Voter service has no direct dependency on the Candidate service. It only depends on a response to its message request. In this way, it is still a synchronous form of IPC, but the Voter service is now decoupled from the Candidate service.

The request message is consumed by the Candidate service (the server), who is listening to that queue. In response, the Candidate service produces a message containing the list of candidates, serialized to JSON. The Candidate service (the server) sends a response back to the Voter service (the client), through RabbitMQ. This is done using the Direct reply-to feature of RabbitMQ, or using a unique response queue, specified in the reply-to header of the request message, sent by the Voter Service.

According to RabbitMQ, ‘the direct reply-to feature allows RPC clients to receive replies directly from their RPC server, without going through a reply queue. (“Directly” here still means going through AMQP and the RabbitMQ server; there is no separate network connection between RPC client and RPC server.)’

According to Spring, ‘starting with version 3.4.0, the RabbitMQ server now supports Direct reply-to; this eliminates the main reason for a fixed reply queue (to avoid the need to create a temporary queue for each request). Starting with Spring AMQP version 1.4.1 Direct reply-to will be used by default (if supported by the server) instead of creating temporary reply queues. When no replyQueue is provided (or it is set with the name amq.rabbitmq.reply-to), the RabbitTemplate will automatically detect whether Direct reply-to is supported and either uses it or fall back to using a temporary reply queue. When using Direct reply-to, a reply-listener is not required and should not be configured.’ We are using the latest versions of both RabbitMQ and Spring AMQP, which should support Direct reply-to.

The Voter service receives the message containing the list of candidates. The Voter service deserializes the JSON payload to Candidate objects and proceeds to generate a random number of votes for each candidate in the list. Finally, each new vote object (MongoDB document) is written back to the vote collection in the Voter service’s voters  database.

Message Queue Diagram 2D

Exploring the RPC Code

We will not examine the REST HTTP IPC code in this post. Instead, we will explore the RPC code. You are welcome to download the source code and explore the REST HTTP code pattern; it uses some advanced features of Spring Boot and Spring Data.

Spring Dependencies

In order to use RabbitMQ, we need to add a project dependency on org.springframework.boot.spring-boot-starter-amqp. Below is a snippet from the Candidate service’s build.gradle file, showing project dependencies. The Voter service’s dependencies are identical.

AMQP Configuration

Next, we need to add a small amount of RabbitMQ AMQP configuration to both services. We accomplish this by using Spring’s @Configuration annotation on our configuration classes. Below is the configuration class for the Voter service.
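The full class is in the Voter Service repository; as a hedged sketch, assuming the voter.rpc exchange discussed later in this post, the Voter service’s configuration might contain little more than the direct exchange declaration.

import org.springframework.amqp.core.DirectExchange;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class VoterConfig {

    // Direct exchange to which the Voter service publishes its RPC request messages
    @Bean
    public DirectExchange directExchange() {
        return new DirectExchange("voter.rpc");
    }
}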

And here, the configuration class for the Candidate service.
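A corresponding hedged sketch for the Candidate service, assuming the voter.rpc.requests queue, voter.rpc direct exchange, and rpc routing key shown later in the RabbitMQ Management Console section.

import org.springframework.amqp.core.Binding;
import org.springframework.amqp.core.BindingBuilder;
import org.springframework.amqp.core.DirectExchange;
import org.springframework.amqp.core.Queue;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class CandidateConfig {

    @Bean
    public Queue rpcRequestsQueue() {
        return new Queue("voter.rpc.requests");
    }

    @Bean
    public DirectExchange directExchange() {
        return new DirectExchange("voter.rpc");
    }

    // Bind the request queue to the exchange using the 'rpc' routing key
    @Bean
    public Binding binding(DirectExchange directExchange, Queue rpcRequestsQueue) {
        return BindingBuilder.bind(rpcRequestsQueue).to(directExchange).with("rpc");
    }
}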

Voter Service Code

With the dependencies and configuration in place, we define the method in the Voter service which will request the candidates from the Candidate service, using RabbitMQ. Below is an abridged version of the Voter service’s CandidateListService class, containing the getCandidatesMessageRpc method. This method calls the rabbitTemplate.convertSendAndReceive method.
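A hedged sketch of that client-side call, assuming the exchange and routing key from the configuration above, Jackson for deserializing the JSON reply, and the post’s Candidate domain class; the exact class in the repository differs.

import java.io.IOException;
import java.util.List;

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.amqp.core.DirectExchange;
import org.springframework.amqp.rabbit.core.RabbitTemplate;
import org.springframework.stereotype.Service;

@Service
public class CandidateListService {

    private final RabbitTemplate rabbitTemplate;
    private final DirectExchange directExchange;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public CandidateListService(RabbitTemplate rabbitTemplate, DirectExchange directExchange) {
        this.rabbitTemplate = rabbitTemplate;
        this.directExchange = directExchange;
    }

    public List<Candidate> getCandidatesMessageRpc(String election) throws IOException {
        // Send the election name and block until the Candidate service replies,
        // either via Direct reply-to or a temporary reply queue (explained below)
        String json = (String) rabbitTemplate.convertSendAndReceive(
                directExchange.getName(), "rpc", election);
        return objectMapper.readValue(json, new TypeReference<List<Candidate>>() {});
    }
}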

Candidate Service Code

Next, we define a method in the Candidate service, which will process the Voter service’s request. Below is an abridged version of the CandidateController class, containing the getCandidatesMessageRpc method. This method is decorated with Spring’s @RabbitListener annotation, which marks the method as the target of a Rabbit message listener on the voter.rpc.requests queue.

Also shown are the getCandidatesMessageRpc method’s two helper methods, getByElection and serializeToJson. These methods query MongoDB for the list of candidates and serialize the list to JSON.
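A hedged sketch of the server side, assuming a MongoTemplate query, Jackson serialization, and the post’s Candidate domain class; again, the names are illustrative rather than the exact code in the repository.

import java.util.List;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.amqp.rabbit.annotation.RabbitListener;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class CandidateController {

    private final MongoTemplate mongoTemplate;
    private final ObjectMapper objectMapper = new ObjectMapper();

    public CandidateController(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // The method's return value is sent back to the client's reply-to address,
    // either Direct reply-to or the temporary reply queue named in the request
    @RabbitListener(queues = "voter.rpc.requests")
    public String getCandidatesMessageRpc(String requestMessage) throws JsonProcessingException {
        List<Candidate> candidates = getByElection(requestMessage);
        return serializeToJson(candidates);
    }

    private List<Candidate> getByElection(String election) {
        // Query MongoDB for the candidates running in the requested election
        return mongoTemplate.find(
                Query.query(Criteria.where("election").is(election)), Candidate.class);
    }

    private String serializeToJson(List<Candidate> candidates) throws JsonProcessingException {
        return objectMapper.writeValueAsString(candidates);
    }
}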

Demonstration

To demonstrate both the synchronous REST HTTP IPC code and the Spring AMQP-based RPC IPC code, we will make a few REST HTTP calls to the Voter API Gateway. For convenience, I have provided a shell script, demostrate_ipc.sh, which executes all the API calls necessary. I have added sleep commands to slow the output to the terminal down a bit, for easier analysis. The script requires HTTPie, a great time saver when working with RESTful services.

The demostrate_ipc.sh script does three things. First, it calls the Candidate service to generate a group of sample candidates. Next, the script calls the Voter service to simulate votes, using synchronous REST HTTP. Lastly, the script repeats the voter simulation, this time using message-based RPC IPC. All API calls are done through the Voter API Gateway on port 8080. To understand the API calls, examine the script, below.

Below is the list of candidates for the 2016 Presidential Election, generated by the Candidate service. The JSON payload was retrieved using the Voter service’s /voter/candidates/rpc/election/{election} endpoint. This endpoint uses the same RPC IPC method as the Voter service’s /voter/simulation/rpc/election/{election} endpoint.

Based on the list of candidates, below are the simulated election results. This JSON payload was retrieved using the Voter service’s /voter/results endpoint.

RabbitMQ Management Console

The easiest way to observe what is happening with our messages is using the RabbitMQ Management Console. To access the console, point your web-browser to localhost, on port 15672. The default login credentials for the console are guest/guest.

As you successfully send and receive messages between the services through RabbitMQ, you should see activity on the Overview tab. In addition, you should see a number of Connections, Channels, Exchanges, Queues, and Consumers.

RabbitMQ_Screen_3

In the Queues tab, you should find a single queue, the voter.rpc.requests queue. This queue was configured in the Candidate service’s configuration class, shown previously.

RabbitMQ_Screen_2

In the Exchanges tab, you should see one exchange, voter.rpc, which we configured in both the Voter and the Candidate service’s configuration classes (aka DirectExchange). Also, visible in the Exchanges tab, should be the routing key rpc, which we configured in the Candidate service’s configuration class (aka Binding).

The route binds the exchange to the voter.rpc.requests queue. If you recall Spring’s description, AMQP has exchanges (DirectExchange), routes (Binding), and queues (Queue). Messages are first published to exchanges. Routes define on which queue(s) to pipe the message. Consumers subscribing to that queue then receive a copy of the message.

RabbitMQ_Screen_1

In the Channels tab, you should note two connections, the single instances of the Voter and Candidate services. Likewise, there are two channels, one for each service. You can differentiate the channels by the presence of the consumer tag. The consumer tag, in this example, amq.ctag-Anv7GXs7ZWVoznO64euyjQ, uniquely identifies the consumer. In this example, the Voter service is the consumer. For a more complete explanation of the consumer tag, check out RabbitMQ’s AMQP documentation.

RabbitMQ_Screen_4.png

Message Structure

Messages cannot be viewed directly in the RabbitMQ Management Console. One way I have found to view messages is using your IDE’s debugger. Below, I have added a breakpoint on the Candidate service’s getCandidatesMessageRpc method, using IntelliJ IDEA. You can view the Voter service’s request message as it is received by the Candidate service.

Debug_RPC_Message.png

Note the message payload, the requested election. Note the twelve message header elements. The headers include the AMQP exchange, queue, and binding. The message headers also include the consumer tag. The message also uniquely identifies the reply-to queue to use, if the server does not support Direct reply-to (see earlier explanation).

Service Logs

In addition to the RabbitMQ Management Console, we may observe communications between the two services by looking at the Voter and Candidate services’ logs. I have grabbed a snippet of both services’ logs and added a few comments to show where different processes are being executed. First, the Voter service logs.

Next, the Candidate service logs.

Performance

What about the performance of Spring AMQP RPC IPC versus REST HTTP IPC? RabbitMQ has proven to be very performant, having been clocked at one million messages per second on GCE. I performed a series of fairly ‘unscientific’ performance tests, completing 250, 500, and then 1,000 requests. The tests were performed on a six-node Docker Swarm cluster with three instances of each service in a round-robin load-balanced configuration, and a single instance of RabbitMQ. The scripts to create the swarm cluster can be found in the Voter service GitHub project.

Based on consistent test results, the speed of the two methods was almost identical. Both methods performed between 3.1 to 3.2 responses per second. For example, the Spring AMQP RPC IPC method successfully completed 1,000 requests in 5 minutes and 11 seconds, while the REST HTTP IPC method successfully completed 1,000 requests in 5 minutes and 18 seconds, 7 seconds slower than the RPC method.

RabbitMQ on Docker Swarm

There are many variables to consider, which could dramatically impact IPC performance. For example, RabbitMQ was not clustered. Also, we did not use any type of caching, such as Varnish, Memcached, or Redis. Both these could dramatically increase IPC performance.

There are also several notable differences between the two methods from a code perspective. The REST HTTP method relies on Spring Data Projection combined with Spring Data MongoDB Repository, to obtain the candidate list from MongoDB. Somewhat differently, the RPC method makes use of Spring Data MongoDB Aggregation to return a list of candidates. Therefore, the test results should be taken with a grain of salt.

Production Considerations

The post demonstrated a simple example of RPC communications between two services using Spring AMQP. In an actual Production environment, there are a few things that must be considered, as Pivotal points out:

  • How should either service react on startup if RabbitMQ is not available? What if RabbitMQ fails after the services have started?
  • How should the Voter service (the client) react if there are no Candidate service instances (the server) running?
  • Should the Voter service have a timeout for the RPC response to return? What should happen if the request times out?
  • If the Candidate service malfunctions and raises an exception, should it be forwarded to the Voter service?
  • How does the Voter service protect against invalid incoming messages (e.g., checking the bounds of the candidate list) before processing?
  • In all of the above scenarios, what, if any, response is returned to the API end user?

Conclusion

Although in this post we did not achieve asynchronous inter-process communications, we did achieve a higher level of service decoupling, using message-based RPC IPC. Adopting a message-based, loosely-coupled architecture, whether asynchronous or synchronous, wherever possible, will improve the overall functionality and deliverability of a microservices-based platform.

References

All opinions in this post are my own and not necessarily the views of my current employer or their clients.



Containerized Microservice Log Aggregation and Visualization using ELK Stack and Logspout

Log aggregation, visualization, analysis, and monitoring of Dockerized microservices using the ELK Stack (Elasticsearch, Logstash, and Kibana) and Logspout

Kibana Dashboard

Introduction

In the last series of posts, we learned how to use Jenkins CI, Maven, Docker, Docker Compose, and Docker Machine to take a set of Java-based microservices from source control on GitHub to a fully tested set of integrated Docker containers running within an Oracle VirtualBox VM. We performed integration tests, using a scripted set of synthetic transactions, to make sure the microservices were functioning as expected within their containers.

In this post, we will round out our Virtual-Vehicles microservices REST API project by adding log aggregation, visualization, analysis, and monitoring, using the ELK Stack (Elasticsearch, Logstash, and Kibana) and Logspout.

ELK Stack 3D Diagram

All code for this post is available on GitHub, release version v3.1.0 on the ‘master’ branch (after running ‘git clone …’, run a ‘git checkout tags/v3.1.0’ command).

Logging

If you’re using Docker, then you’re familiar with the command ‘docker logs container-name‘. This command streams the log output of running services within a container and is commonly used for debugging and troubleshooting. It sure beats ‘docker exec -it container-name cat /var/logs/foo/foo.log‘, and so on, for each log we need to inspect within a container.

With Docker Compose, we gain the command ‘docker-compose logs‘. This command streams the log output of the running services of all containers defined in our ‘docker-compose.yml‘ file. Although moderately more useful for debugging, I’ve also found it fairly buggy when used with Docker Machine and Docker Swarm.

As helpful as these types of Docker commands are, when you start scaling from one container, to ten containers, to hundreds of containers, individually inspecting container logs from the command line is time-consuming and of little value. Correlating log events between containers is impossible. That’s where solutions such as the ELK Stack and Logspout really shine for containerized environments.

ELK Stack

Although not specifically designed for the purpose, the ELK Stack (Elasticsearch, Logstash, and Kibana) is an ideal tool-chain for log aggregation, visualization, analysis, and monitoring. Individually setting up Elasticsearch, Logstash, and Kibana, and configuring them to communicate with each other, is not a small task. Luckily, there are several ready-made Docker images on Docker Hub, whose authors have already done much of the hard work for us. After trying several ELK containers on Docker Hub, I chose the willdurand/elk image. This image is easy to get started with, and is easily used to build containers using Docker Compose.

Logspout

Using the ELK Stack, we have a way to collect (Logstash), store and search (Elasticsearch), and visualize and analyze (Kibana) our containers’ log events. Although Logstash is capable of collecting our log events, to integrate more easily with Docker, we will add another component, Glider Labs’ Logspout, to our tool-chain. Logspout advertises itself as “a log router for Docker containers that runs inside Docker. It attaches to all containers on a host, then routes their logs wherever you want. It also has an extensible module system.”

Since Logspout is extensible through third-party modules, we will use one last component, Loop Lab’s Logspout/Logstash Adapter. Written in the Go programming language, the adapter is described as “a minimalistic adapter for Glider Lab’s Logspout to write to Logstash UDP”. This adapter will allow us to collect Docker’s log events with Logspout and send them to Logstash using the User Datagram Protocol (UDP).

In order to use the Logspout/Logstash adapter, we need to build a Logspout container from the /logspout Docker image, which contains a customized version of Logspout’s modules.go configuration file. This is explained in the Custom Logspout Builds section of Logspout’s README.md. Below is the modified configuration module with the addition of the adapter (see last import statement).

package main

import (
  _ "github.com/gliderlabs/logspout/adapters/raw"
  _ "github.com/gliderlabs/logspout/adapters/syslog"
  _ "github.com/gliderlabs/logspout/httpstream"
  _ "github.com/gliderlabs/logspout/routesapi"
  _ "github.com/gliderlabs/logspout/httpstream"
  _ "github.com/gliderlabs/logspout/transports/udp"
  _ "github.com/looplab/logspout-logstash"
)

One note on Logspout: according to its website, for now, Logspout only captures stdout and stderr, but a module to collect container syslog is planned. Although syslog is a common method of centralized log collection, the Docker logs we will collect are sent to stdout and stderr, so the lack of syslog support is not a limitation for us in this demonstration.

We will configure Logstash to accept log events from Logspout, using UDP on port 5000. Below is an abridged version of the logstash-logspout-log4j2.conf configuration file. The excerpt from the configuration file, below, instructs Logstash to listen for Logspout’s messages over UDP on port 5000, and pass them on to Elasticsearch.

input {
  udp {
    port  => 5000
    codec => json
    type  => "dockerlogs"
  }
}
 
# filtering section not shown...
 
output {
  elasticsearch { protocol => "http" }
  stdout { codec => rubydebug }
}

We could spend several posts on the configuration of Logstash. There is a nearly infinite number of input, filter, and output combinations Logstash can use to collect, transform, and push log events to and from various programs. The filtering section alone takes some time to learn exactly how to filter and transform log events, based upon the requirements for visualization and analysis.

Apache Log4j Logs

What about our Virtual-Vehicles microservices’ Log4j 2 logs? In the previous posts, you’ll recall we were sending our log events to physical log files within each container, using Log4j 2’s Rolling File appender.

<Appenders>
    <RollingFile name="RollingFile" fileName="${log-path}/virtual-authentication.log"
                 filePattern="${log-path}/virtual-authentication-%d{yyyy-MM-dd}-%i.log" >
        <PatternLayout>
            <pattern>%d{dd/MMM/yyyy HH:mm:ss,SSS}- %c{1}: %m%n</pattern>
        </PatternLayout>
        <Policies>
            <SizeBasedTriggeringPolicy size="1024 KB" />
        </Policies>
        <DefaultRolloverStrategy max="4"/>
    </RollingFile>
</Appenders>

Given the variety of appenders available with Log4j 2, we have a few options for leveraging the ELK Stack with these log events. The least disruptive change would be to send the Log4j log events to Logspout by redirecting Log4j output from the physical log file to stdout. We could do this by running a Linux link command in each microservice’s Dockerfile, as in the following example with the Authentication microservice.

RUN touch /var/log/virtual-authentication.log && \
    ln -sf /dev/stdout /var/log/virtual-authentication.log

This method would not require us to change the log4j2.xml configuration files, and rebuild the services. However, the alternative we will use in this post is switching to Log4j’s Syslog appender. According to Log4j documentation, the Syslog appender is a Socket appender that writes its output to a remote destination specified by a host and port in a format that conforms with either the BSD Syslog format or the RFC 5424 format. The data can be sent over either TCP or UDP.

To use the Syslog appender option, we do need to change each log4j2.xml configuration file, and then rebuild each of the microservices. Instead of using UDP over port 5000, which is the port Logspout is currently using to communicate with Logstash, we will use UDP over port 5001. Below is a sample of the log4j2.xml configuration files for the Authentication microservice.

<Appenders>
    <Syslog name="RFC5424" format="RFC5424" host="elk" port="5001"
            protocol="UDP" appName="virtual-authentication" includeMDC="true"
            facility="SYSLOG" enterpriseNumber="18060" newLine="true"
            messageId="log4j2" mdcId="mdc" id="App"
            connectTimeoutMillis="1000" reconnectionDelayMillis="5000">
        <LoggerFields>
            <KeyValuePair key="thread" value="%t"/>
            <KeyValuePair key="priority" value="%p"/>
            <KeyValuePair key="category" value="%c"/>
            <KeyValuePair key="exception" value="%ex"/>
            <KeyValuePair key="message" value="%m"/>
        </LoggerFields>
    </Syslog>
</Appenders>

To communicate with Logstash over port 5001 with the Syslog appender, we also need to modify the logstash-logspout-log4j2.conf configuration file again. Below is the unabridged version of the configuration file, with both the Logspout (UDP port 5000) and Log4j (UDP port 5001) configurations.

input {
  udp {
    port  => 5000
    codec => json
    type  => "dockerlogs"
  }

  udp {
    type => "log4j2"
    port => 5001
  }
}

filter {
  if [type] == "log4j2" {
    mutate {
     gsub => ['message', "\n", " "]
     gsub => ['message', "\t", " "]
    }
  }

  if [type] == "dockerlogs" {
    if ([message] =~ "^\tat ") {
      drop {}
    }

    grok {
      break_on_match => false
      match => [ "message", " responded with %{NUMBER:status_code:int}" ]
      tag_on_failure => []
    }

    grok {
      break_on_match => false
      match => [ "message", " in %{NUMBER:response_time:int}ms" ]
      tag_on_failure => []
    }
  }
}

output {
  elasticsearch { protocol => "http" }
  stdout { codec => rubydebug }
}

You will note some basic filtering in the configuration. I will touch upon this in the next section. Below is a diagram showing the complete flow of log events, from both Log4j and the Docker containers, to Logspout and the ELK Stack.

ELK Log Message Flow

Troubleshooting and Debugging

Trying to troubleshoot why log events are not showing up in Kibana can be frustrating without methods to debug the flow of log events along the way. Were the stdout Docker log events successfully received by Logspout? Did Logspout successfully forward the log events to Logstash? Did Log4j successfully push the microservices’ log events to Logstash? And, probably the most frustrating issue of all, did you properly configure the Logstash configuration file(s) to receive, filter, transform, and push the log events to Elasticsearch? I spent countless hours debugging filtering alone. Luckily, there are several ways to verify log events are flowing. The diagram below shows some of the debug points along the way.

ELK Ports

First, we can check that the log events are making it from Docker to Logspout by cURLing or browsing port 8000. Executing ‘curl -X GET --url http://api.virtual-vehicles.com:8000/logs‘ will tail the incoming log events received by Logspout. You should see log events flowing into Logspout as you call the microservices through NGINX, for example, by running the project’s integration tests, as shown below.

Logspout Debugging
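
A convenient way to exercise this first debug point is to tail Logspout’s /logs endpoint in one terminal, while generating traffic in a second terminal by running the project’s integration tests (both commands appear elsewhere in this post):

# Terminal 1: stream the log events received by Logspout
curl -X GET --url http://api.virtual-vehicles.com:8000/logs

# Terminal 2: generate log events by exercising the microservices through NGINX
sh tests_color.sh api.virtual-vehicles.com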

Second, we can cURL or browse port 9200, which exposes information about Elasticsearch. Elasticsearch’s REST API offers several useful endpoints. Executing ‘curl -X GET --url http://api.virtual-vehicles.com:9200/_status?pretty‘ will display statistics about Elasticsearch, including the number of log events, referred to as ‘documents‘ in Elasticsearch’s structured, JSON document-based NoSQL datastore. Note the line, ‘"num_docs": 469‘, indicating 469 log events were captured by Elasticsearch as documents.

{
    "_shards": {
        "total": 32,
        "successful": 16,
        "failed": 0
    },
    "indices": {
        "logstash-2015.08.01": {
            "index": {
                "primary_size_in_bytes": 525997,
                "size_in_bytes": 525997
            },
            "translog": {
                "operations": 492
            },
            "docs": {
                "num_docs": 469,
                "max_doc": 469,
                "deleted_docs": 0
            }
        }
    }
}
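
Beyond _status, a couple of other Elasticsearch endpoints are useful at this debug point. Assuming the same host and port as above, the _cat/indices endpoint lists each Logstash index along with its document count, and the _count endpoint confirms documents of a particular type are being indexed:

# List the Logstash indices, including health, document counts, and size
curl -X GET --url 'http://api.virtual-vehicles.com:9200/_cat/indices?v'

# Count only the Logspout-sourced ('dockerlogs') documents in a given day's index
curl -X GET --url 'http://api.virtual-vehicles.com:9200/logstash-2015.08.01/_count?q=type:dockerlogs&pretty'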

If you find log events are not flowing into Logstash, a quick way to start debugging issues is to check Logstash’s log:

docker exec -it jenkins_elk_1 cat /var/log/logstash/stdout.log

If you find log events are flowing into Logstash, but not being captured by Elasticsearch, the problem is most likely your Logstash configuration file; the input, filter, and/or output sections are wrong. A quick way to debug these types of issues is to check Elasticsearch’s log. I’ve found this log often contains useful and specific error messages, which can help fix Logstash configuration issues.

docker exec -it jenkins_elk_1 cat /var/log/elasticsearch/logstash.log

Without log event documents in Elasticsearch, there is no sense moving on to Kibana. Kibana will have no data available to display.

Kibana

If you recall from our last post, the project already has Graphite and StatsD configured and running, as shown below. On its own, Graphite provides important monitoring and performance information about our microservices. In fact, we could choose to also send all our Docker log events, through Logstash, to Graphite. This would require some additional filtering and output configuration.

Graphite Dashboard

However, our main interest in this post is the ELK Stack. The way we visualize and analyze the log events we have captured is through Kibana. Kibana resembles other popular log aggregators and log search and analysis products, like Splunk, Graylog, and Sumo Logic. I suggest you familiarize yourself with Kibana before diving into this part of the demonstration. Kibana can be confusing at first, if you are not familiar with its indexing, discovery, and search features.

We can access Kibana from our browser, at port 8200, ‘http://api.virtual-vehicles.com:8200‘. The first interactions with Kibana will be through the Discover view, as seen in the screen grab shown below. Kibana displays the typical vertical bar chart event timeline, based on log event timestamps. The details of each log event are displayed below the timeline. You can filter and search within this view. Searches can be saved and used later.

Kibana Discovery Tab

Heck, just the ability to view and search all our log events in one place is a huge improvement over the command line. If you look a little closer at the actual log events, as shown below, you will notice two types, ‘dockerlogs‘ and ‘log4j2‘. Looking at the Logstash configuration file again, shown previously, you see we applied the ‘type‘ tag to the log events as they were being processed by Logstash.

Kibana Discovery Message Types

In the Logstash configuration file, shown previously, you will also note the use of a few basic filters. I created ‘status_code‘ and ‘response_time‘ filters, specifically for the Docker log events. Each Docker log event is passed through the filters. The two fields, ‘status_code‘ and ‘response_time‘, are extracted from the main log event text and added as separate, indexable, and searchable fields. Below is an example of one such Docker log event, an HTTP DELETE call to the Valet microservice, shown as JSON. Note the two fields, showing a response time of 13ms and an HTTP status code of 204.

{
  "_index": "logstash-2015.08.01",
  "_type": "dockerlogs",
  "_id": "AU7rcyxTA4OY8JukKyIv",
  "_score": null,
  "_source": {
    "message": "DELETE http://api.virtual-vehicles.com/valets/55bd30c2e4b0818a113883a6 
                responded with 204 No Content in 13ms",
    "docker.name": "/jenkins_valet_1",
    "docker.id": "7ef368f9fdca2d338786ecd8fe612011aebbfc9ad9b677c21578332f7c46cf2b",
    "docker.image": "jenkins_valet",
    "docker.hostname": "7ef368f9fdca",
    "@version": "1",
    "@timestamp": "2015-08-01T22:47:49.649Z",
    "type": "dockerlogs",
    "host": "172.17.0.7",
    "status_code": 204,
    "response_time": 13
  },
  "fields": {
    "@timestamp": [
      1438469269649
    ]
  },
  "sort": [
    1438469269649
  ]
}

For comparison, here is a sample Log4j 2 log event, generated by a JsonParseException. Note the different field structure. With more time spent modifying the Log4j event format, and configuring Logstash’s filtering and transforms, we could certainly improve the usability of Log4j log events.

{
  "_index": "logstash-2015.08.02",
  "_type": "log4j2",
  "_id": "AU7wJt8zA4OY8JukKyrt",
  "_score": null,
  "_source": {
    "message": "<43>1 2015-08-02T20:42:35.067Z bc45ce804859 virtual-authentication - log4j2
                [mdc@18060 category=\"com.example.authentication.objectid.JwtController\" exception=\"\"
                message=\"validateJwt() failed: JsonParseException: Unexpected end-of-input: was expecting closing
                quote for a string value  at [Source: java.io.StringReader@12a24457; line: 1, column: 27\\]\"
                priority=\"ERROR\" thread=\"nioEventLoopGroup-3-9\"] validateJwt() failed: JsonParseException:
                Unexpected end-of-input: was expecting closing quote for a string value  at [Source:
                java.io.StringReader@12a24457; line: 1, column: 27] ",
    "@version": "1",
    "@timestamp": "2015-08-02T20:42:35.188Z",
    "type": "log4j2",
    "host": "172.17.0.9"
  },
  "fields": {
    "@timestamp": [
      1438548155188
    ]
  },
  "sort": [
    1438548155188
  ]
}
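
Because ‘status_code‘ and ‘response_time‘ are indexed as separate fields, we can also query them directly against Elasticsearch, outside of Kibana. For example, using the same index naming shown above:

# Find Docker log events that responded with an HTTP status code of 500
curl -X GET --url 'http://api.virtual-vehicles.com:9200/logstash-2015.08.02/_search?q=type:dockerlogs+AND+status_code:500&pretty'

# Return the ten slowest requests, sorted by the extracted response_time field
curl -X GET --url 'http://api.virtual-vehicles.com:9200/logstash-2015.08.02/_search?q=type:dockerlogs&sort=response_time:desc&size=10&pretty'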

Kibana Dashboard

To demonstrate the visualization capabilities of Kibana, we will create a Dashboard. Our Dashboard will be composed of a series of Kibana Visualizations. Visualizations are charts, graphs, tables, and metrics, based on the log events we see in the Discover view. Below, I have created a rather basic Dashboard, containing some simple data visualizations, based on our Docker and Log4j log events, collected over a 1-hour period. This one small screen grab does not begin to do justice to the real power of Kibana.

Kibana Dashboard

In the dashboard above, you see a few basic metrics, such as request response times and HTTP status codes, a chart of which containers are logging events, a graph showing the number of log events captured per minute, and so forth. Along with Searches, Visualizations and Dashboards can also be saved in Kibana. Note this demonstration’s Docker Compose YAML file does not configure volume mapping between the containers and the host. If you destroy the containers, you destroy anything you have saved in Kibana.

A key feature of Kibana’s Dashboards is their interactivity. Rolling over any piece of a Visualization brings up an informative pop-up with additional details. For example, as shown below, rolling over the HTTP status code ‘500’ slice of the pie chart pops up the number of status code 500 responses. In this case, 15 log events, or 1.92% of the total 2,994 log events captured, had a ‘status_code’ field of ‘500’, within the 24-hour period the Dashboard analyzed.

Kibana Dashboard with Popup

Conveniently, Kibana also allows you to switch from a visual mode to a data table mode, for any Visualization on the Dashboard, as shown below, for a 24-hour period.

Kibana Dashboard as Tables

Conclusion

The ELK Stack is just one of a number of enterprise-class tools available to monitor and analyze the overall health of your applications running within a Dockerized environment. Having well-planned logging, monitoring, and analytics strategies is key to this type of project. They should be implemented from the beginning of the project, to increase development and testing velocity, as well as to provide quick troubleshooting, key business metrics, and proactive monitoring once the application is in production.

Notes on Running the GitHub Project

If you download and run this project from GitHub, there are two key steps you should note. First, you need to add an entry to your local /etc/hosts file. The IP address will be that of the Docker Machine VM, ‘test’. The hostname is ‘api.virtual-vehicles.com’, which matches the one I used throughout the demo. You should run the following bash command before building your containers from the docker-compose.yml file, but after you have built your VM using Docker Machine. The ‘test’ VM must already exist.

echo "$(docker-machine ip test)   api.virtual-vehicles.com" \
  | sudo tee --append /etc/hosts
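
Once the entry is added, it is worth confirming the hostname resolves to the Docker Machine VM’s IP address before bringing up the containers:

# The address returned by ping should match the VM's IP address
ping -c 1 api.virtual-vehicles.com
docker-machine ip test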

If you want to override this domain name with your own, you will need to modify and re-build the microservices project, first. Then, copy those build artifacts into this project, replacing the ones you pulled from GitHub.

Second, in order to achieve HATEOAS in my REST API responses, I have included some variables in my docker-compose.yml file. Wait, docker-compose.yml doesn’t support variables? Well, it can if you use a template file (docker-compose-template.yml) and run a script (compose_replace.sh) to provide variable expansion. My gist explains the technique a little better.

You should also run this command before building your containers from the docker-compose.yml file, but after you have built your VM using Docker Machine. Again, the ‘test’ VM must already exist.

sh compose_replace.sh
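
For the curious, the underlying technique is simple string substitution. Below is a minimal sketch of the idea, using a hypothetical {{HOST_IP}} placeholder token; the actual compose_replace.sh script and template tokens in the project may differ:

#!/bin/sh
# Sketch of the template-expansion technique (hypothetical placeholder token).
# Substitute the Docker Machine VM's IP address into the template and write
# the resulting docker-compose.yml file used by Docker Compose.
HOST_IP=$(docker-machine ip test)
sed "s/{{HOST_IP}}/${HOST_IP}/g" docker-compose-template.yml > docker-compose.yml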

Lastly, remember, we can run our integration tests to generate log events, using the following command.

sh tests_color.sh api.virtual-vehicles.com

Integration Tests



Building a Microservices-based REST API with RestExpress, Java EE, and MongoDB: Part 1

Develop a well-architected and well-documented REST API, built on a tightly integrated collection of Java EE-based microservices.

Generic API Architecture

Microservices

Microservices are a popular and growing trend in software development. According to Wikipedia, microservices are “a software architecture style, in which complex applications are composed of small, independent processes communicating with each other using language-agnostic APIs. These services are small, highly decoupled and focus on doing a small task.”

Martin Fowler and James Lewis (ThoughtWorks) have done an exemplary job capturing the essence of microservice architecture in their March 2014 post, microservices. Fowler has also discussed these principles in several presentations, including the January 2015 goto; Conference, Keynote: Microservices by Martin Fowler.

Additionally, noted technical consultant and speaker, Adrian Cockcroft (Battery Ventures), has made significant contributions to the definition of microservices, such as in his December 2014 dockercon14 | eu presentation, State of the Art in Microservices.

Lastly, Zhamak Dehghani (ThoughtWorks), delivered an in-depth discussion of microservices, including customer perspectives, in her October 2014 presentation, Real-World Microservices: Lessons from the Frontline.

Some of the major characteristics of microservices and REST cited by these experts include:

  • Organized Around Business Capabilities
  • Single Responsibility
  • Loose Coupling / High Cohesion
  • Smart Endpoints and Dumb Pipes
  • Decentralized Data Management
  • Hypermedia as the Engine of Application State (HATEOAS)

As we develop this post’s example, I will demonstrate how all of the above characteristics are implemented.

REST API

A REST API is the mash-up of two common software concepts, Representational State Transfer (REST) and an application programming interface (API). Although even Wikipedia doesn’t have an exact definition of a REST API, they come close in their discussion of REST. According to Wikipedia, “Web service APIs that adhere to the REST architectural constraints are called RESTful APIs. HTTP-based RESTful APIs are defined with these aspects: base URI, an Internet media type for the data, standard HTTP methods, hypertext links to a reference state, and hypertext links to reference related resources.”

An important nuance, and a differentiator from SOA-based APIs, is that RESTful APIs do not require XML-based Web service protocols (SOAP and WSDL) to support their interfaces (Wikipedia).

The author of the WebConcepts channel does an excellent job capturing the essence of REST APIs in REST API concepts and examples. Two additional presentations I strongly recommend are REST+JSON API Design – Best Practices for Developers and Designing a Beautiful REST+JSON API, both by Les Hazlewood, CTO of Stormpath. Stormpath is a leader in the commercial REST API space.

Microservices-Based REST API

A microservices-based REST API is a REST API whose HTTP requests call an orchestrated collection, or collections, of language-agnostic and platform-agnostic microservices. The combination of these two trends, microservices and a REST API, offers a simple, reliable, and scalable solution for providing flexible functionality to an end-user, in a technology-agnostic manner.

REST API Example

There is a fast-growing volume of reference materials describing the characteristics, benefits, and general architecture of microservices and REST APIs. However, in researching these topics, I have found a shortage of practical examples or tutorials on building microservices-based REST API solutions.

Undoubtedly, the complexity of even the simplest microservices-based solution limits the number of available examples. A minimally viable solution requires planning, coding, testing, and documentation. Adding cross-cutting features, such as security, logging, monitoring, and orchestration, makes building a practical microservices-based example an enormous task.

In the following series of posts, we will use many of the characteristics of a modern microservice architecture as described by Fowler, Lewis, Cockcroft, and Dehghani. We will combine these microservice characteristics with the best practices of good REST API design, as described by Hazlewood and WebConcepts, to build a minimally viable microservices-based REST API.

In a future post, we will create an application, which leverages the microservices-based solution, through the REST API. Additionally, we will demonstrate how to ensure high-availability of the individual microservices and data sources.

Vehicle for Learning

In a similar vein to the publicly available Twitter, Facebook, and Google REST APIs, we will build the Virtual-Vehicles REST API. The Virtual-Vehicles REST API will constitute a collection of vehicle-themed microservices. Collectively, the microservices will offer a comprehensive set of functionality to the end-user, an application developer. They, in turn, will use the functionality of the Virtual-Vehicles REST API to build applications and games for their end-users.

Technology Choices

There are a seemingly infinite number of technology choices for building microservices and REST APIs. Your choice of development languages, databases, application servers, third-party libraries, API gateway, logging, monitoring, automated testing, ORM or ODM, and even the IDE, all define your technology stack.

For the Virtual-Vehicles solution, we will use a number of key technologies, starting with RestExpress.

What is RestExpress?

According to their website, RestExpress composes best-of-breed open-source tools to enable quickly creating RESTful microservices that embrace industry best practices. Built from the ground up for container-less microservice architectures, RestExpress claims to be the easiest way to create RESTful APIs in Java. It is an extremely lightweight, fast REST engine and API for Java, implemented as a thin wrapper on Netty’s HTTP handling, which lets you create performant, stand-alone REST APIs rapidly. RestExpress provides several Maven archetypes, which we will use as a basis for our microservices.

RestExpress will also drive our technology decisions to use Java EE, Maven, MongoDB, and Netty.
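
Creating a new microservice project from one of RestExpress’s Maven archetypes follows the standard Maven workflow. The archetype coordinates and project names below are placeholders only; consult the RestExpress documentation for the actual groupId, artifactId, and version:

# Generate a new RestExpress-based microservice from a Maven archetype
# (archetype coordinates and project names are placeholders, not actual values)
mvn archetype:generate \
  -DarchetypeGroupId=<restexpress-archetype-group-id> \
  -DarchetypeArtifactId=<restexpress-archetype-artifact-id> \
  -DarchetypeVersion=<archetype-version> \
  -DgroupId=com.example.virtualvehicles \
  -DartifactId=virtual-vehicle-service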

Virtual-Vehicles Microservices

Adhering to the first few microservice architectural principles listed above (organized around business capabilities, single responsibility, and high cohesion), we must first determine the proper number of microservices and their individual responsibilities. In the case of our solution, we will break down Virtual-Vehicles’ business capabilities into the following microservices:

Virtual-Vehicles Services

Each microservice and its purpose (business capability):

  • Authentication Service: Manage API clients and JWT authentication
  • Vehicle Service: Manage virtual vehicles
  • Maintenance Service: Manage maintenance on vehicles
  • Valet Parking Service: Manage a valet service to park vehicles
  • Sales Service: Manage the buying and selling of vehicles
  • Registration Service: Manage registration of vehicles to owners
  • Auction Service: Manage a virtual car auction
  • Car Show Service: Manage a virtual car show
  • Interaction Service: Manage interaction of users with vehicles

For simplicity, in this post’s example, we will only be exploring four of the services shown above: Authentication, Vehicle, Maintenance, and Valet Parking.

This segmentation of service functionality is unlike what we might encounter in traditional monolithic, n-tier applications and SOA-based architectures. Traditional applications were built around application-centric functionality or the business’s organizational structure. Microservices, however, are client-centric and built around business capabilities.

REST API Functionality

The next decision we need to make is the required functionality. What are the operational requirements of each business segment represented by the microservices? Additionally, what are the non-functional requirements, such as monitoring, logging, and authentication? Requirements are translated into functionality, which is translated into the resources exposed via each service’s RESTful endpoints.

For the Virtual-Vehicles microservices solution, based on a hypothetical set of business and non-functional requirements, we will expose the following resources. Collectively, they will compose the REST API:

Virtual-Vehicles REST API Resources

Each microservice’s purpose (business capability) and its functions:

Authentication Service: Manage API clients and JWT authentication
  • Create a new API client (public)
  • Read, filter, sort, count, paginate API clients (admin)
  • Read a single API client (public)
  • Update an existing API client (public)
  • Delete an existing API client (admin)
  • Create new JWT (public)
  • Validate a JWT (internal)
  • Service health ping (admin)
Vehicle Service: Manage virtual vehicles
  • Create a new vehicle (public)
  • Read, filter, sort, count, paginate vehicles (admin)
  • Read a single vehicle (public)
  • Update an existing vehicle (public)
  • Delete an existing vehicle (admin)
  • Validate a JWT (internal)
  • Service health ping (admin)
Maintenance Service: Manage maintenance on vehicles
  • Create a new maintenance record (public)
  • Read, filter, sort, count, paginate maintenance records (admin)
  • Read a single maintenance record (public)
  • Update an existing maintenance record (public)
  • Delete an existing maintenance record (admin)
  • Validate a JWT (internal)
  • Service health ping (admin)
Valet Parking Service: Manage a valet service to park vehicles
  • Create a new valet parking transaction (public)
  • Read, filter, sort, count, paginate valet parking transactions (admin)
  • Read a single valet parking transaction (public)
  • Update an existing valet parking transaction (public)
  • Delete an existing valet parking transaction (admin)
  • Validate a JWT (internal)
  • Service health ping (admin)

Reviewing the table above, note the first five functions for each service are all basic CRUD operations: create (POST), read (GET), readAll (GET), update (PUT), delete (DELETE). The readAll function also has find, count, and pagination functionality using query parameters.

All services also have an internal authenticateJwt function, to authenticate the JWT, passed in the HTTP request header, before performing any operation. Additionally, all services have a basic health-check function, ping (GET). There are only a few other functions required for our Virtual-Vehicles example, such as for creating JWTs.

I’ve labeled each function with its suggested user scope. Scopes include public, admin, and internal. As a consumer of the REST API, you may only want to expose certain functionality to your general end-users (public). Additional functionality may be reserved for an administrative user (admin), or only yourself as a developer (internal). Creating a new vehicle might be a common end-user feature. However, the ability to permanently delete one or more vehicles may be reserved for an admin-level user, or not exposed at all.
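
To make the scopes and the JWT flow concrete, here is a purely hypothetical example of how an API consumer might call the Authentication and Vehicle services; the actual resource URIs, payloads, and header format are defined by the services themselves and may differ:

# Hypothetical example only; actual URIs, payloads, and headers may differ.
# 1. Create a new JWT using the Authentication service (public scope)
curl -X POST --url 'http://api.virtual-vehicles.com/jwts' \
  -H 'Content-Type: application/json' \
  -d '{"apiKey": "<your-api-key>", "secretKey": "<your-secret-key>"}'

# 2. Pass the returned JWT with each subsequent request, for example,
#    creating a new vehicle with the Vehicle service (public scope)
curl -X POST --url 'http://api.virtual-vehicles.com/vehicles' \
  -H 'Authorization: Bearer <jwt-returned-above>' \
  -H 'Content-Type: application/json' \
  -d '{"make": "Honda", "model": "Accord", "year": 2015}'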

REST API Patterns

We will not spend a lot of time discussing patterns for building REST APIs. There are many useful materials available on the Internet regarding industry-standard patterns for REST API resource URI construction. The two presentations by Les Hazlewood, CTO of Stormpath, recommended above, are excellent. The Microservices.io, RestApiTutorial.com, swagger.io, and raml.org websites also offer solid overviews of REST patterns and RESTful standards.

A common RESTful anti-pattern, which is hard to avoid as an OOP developer, is the temptation to use verbs instead of nouns, and method-like names, in resource URIs. Remember, we are not designing an end-user application. We are building an API, used by API consumers (application developers), to build a variety of platform- and language-agnostic applications. Functions like paintCar, changeOil, or parkVehicle are not something the API should define. The Vehicle microservice exposes the update operation, which allows an application developer to change the car’s paint color in their paintCar method. Similarly, the Valet service exposes the create operation, which allows the application developer to create a function to park the vehicle (or car, or truck, in a garage, or parking lot, etc.). A good REST API allows for maximum end-user flexibility.
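
Here is a sketch of the difference, using hypothetical URIs and payloads:

# Anti-pattern: a verb-based, method-like resource URI (avoid this)
curl -X POST --url 'http://api.virtual-vehicles.com/paintCar?color=red'

# Preferred: the application developer's paintCar method simply updates the
# vehicle resource (hypothetical URI and payload)
curl -X PUT --url 'http://api.virtual-vehicles.com/vehicles/<vehicle-id>' \
  -H 'Content-Type: application/json' \
  -d '{"color": "red"}'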

Part Two

In Part Two, we will install a copy of the Virtual-Vehicles project from GitHub and gain a basic understanding of how RestExpress works. Finally, we will discover how to get the Virtual-Vehicles microservices up and running.

Virtual-Vehicles Architecture

