Posts Tagged ECR

Cross-Account Amazon Elastic Container Registry (ECR) Access for ECS

Deploying containerized applications on Amazon ECS using cross-account elastic container registries

This is an updated version of a post, originally published in October 2019. This post uses AWS CLI version 2 and contains updated versions of all Docker images.

Introduction

There are two scenarios I frequently encounter that require sharing Amazon Elastic Container Registry (ECR)-based Docker images across multiple AWS Accounts. In the first scenario, a vendor wants to share a Docker image with their customer, stored in the vendor’s private container registry. Many popular container security and observability solutions function in this manner.

Below, we see an example of an application consisting of three containers. Two of the container images originated from the customer’s own ECR repositories (right side). The third image originated from their vendor’s ECR repository (left side).

In the second scenario, an enterprise operates multiple AWS accounts to create logical security boundaries between environments and responsibilities. The first AWS account contains the enterprise’s deployable assets, including their ECR image repositories. The enterprise has additional accounts, such as Development, Test, Staging, and Production, for each Software Development Life Cycle (SDLC) phase. The ECR images in the repository account need to be accessed from multiple AWS accounts and often across different AWS Regions for deployment.

Below, we see an example of a deployed application also consisting of three containers. All the container images originated from the ECR repositories account (left side). The images were pulled into the Production account during deployment to ECS (right side).

This post will explore the first scenario — a vendor who wants to share a private Docker image with their customer securely. The post will demonstrate how to share images across AWS accounts for use with Docker Swarm and Amazon Elastic Container Service (ECS) with AWS Fargate, both using ECR Repository Policies.

For the demonstration, we will use an existing application I have created, a RESTful, HTTP-based NLP (Natural Language Processing) API, consisting of four Golang microservices. The edge service, nlp-client, communicates with the rake-app, lang-app, and prose-app services. There is a fifth service, dyanmo-app, which is not discussed in this post, but easily added to the API.

A customer has developed the nlp-client, lang-app, and prose-app container-based microservices as part of their NLP application in the post’s scenario. Instead of developing their own implementation of the RAKE (Rapid Automatic Keyword Extraction) algorithm, they have licensed a version from a vendor. The vendor’s rake-app service is delivered in the form of a licensed Docker image. The acronym ISV (Independent Software Vendor) is used to represent the vendor throughout the code.

The NPL API exposes several endpoints accessible through the nlp-client service. The endpoints perform common NLP operations on text input, such as extracting keywords, tokens, entities, and sentences, and determining the language. All the endpoints can be listed using the /routes endpoint.

[
{
"method": "GET",
"path": "/error",
"name": "main.getError"
},
{
"method": "POST",
"path": "/keywords",
"name": "main.getKeywords"
},
{
"method": "POST",
"path": "/language",
"name": "main.getLanguage"
},
{
"method": "GET",
"path": "/health",
"name": "main.getHealth"
},
{
"method": "GET",
"path": "/health/:app",
"name": "main.getHealthUpstream"
},
{
"method": "GET",
"path": "/routes",
"name": "main.getRoutes"
},
{
"method": "POST",
"path": "/tokens",
"name": "main.getTokens"
},
{
"method": "POST",
"path": "/entities",
"name": "main.getEntities"
},
{
"method": "POST",
"path": "/sentences",
"name": "main.getSentences"
}
]

Requirements

To follow along with the post’s demonstration, you will need two AWS accounts, one representing the vendor and one representing one of their customers. It is relatively simple to create additional AWS accounts — all you need is a unique email address (easy with Gmail) and a credit card. Using AWS Organizations can make the task of creating and managing multiple accounts even easier.

I have intentionally used different AWS Regions to demonstrate how you can share ECR images across both AWS accounts and regions. You will need a current version of the AWS CLI version 2 and of Docker. Lastly, you will need adequate access to each AWS account to create resources.

Source Code

The demonstration’s source code is contained in five public GitHub repositories.

git clone --branch v2.0.0 \
    --single-branch --depth 1 \
    https://github.com/garystafford/ecr-cross-account-demo.git
git clone --branch master \
    --single-branch --depth 1 \
    https://github.com/garystafford/nlp-client.git
git clone --branch master \
    --single-branch --depth 1 \
    https://github.com/garystafford/prose-app.git
git clone --branch master \
    --single-branch --depth 1 \
    https://github.com/garystafford/rake-app.git
git clone --branch master \
    --single-branch --depth 1 \
    https://github.com/garystafford/lang-app.git

The v2.0.0 branch of the ecr-cross-account-demo GitHub repository contains all the CloudFormation templates and the Docker Compose Stack file.

.
├── LICENSE
├── README.md
├── cfn-templates
│ ├── development-user-group-customer.yml
│ ├── development-user-group-isv.yml
│ ├── ecr-repo-not-shared.yml
│ ├── ecr-repo-shared.yml
│ ├── public-subnet-public-loadbalancer.yml
│ └── public-vpc.yml
└── docker
└── stack.yml

Each of the other four GitHub repositories, such as the nlp-client repository, contains a Golang-based microservice, which together comprises the NLP API. Each repository also contains a Dockerfile.

.
├── Dockerfile
├── LICENSE
├── README.md
├── buildspec.yml
├── go.mod
├── go.sum
└── main.go

We will use AWS CloudFormation to create the necessary resources within both AWS accounts. We will also use CloudFormation to create an ECS Cluster and an Amazon ECS Task Definition for the customer account. Task Definition defines how ECS will deploy the application, consisting of four Docker containers, using AWS Fargate. In addition to ECS, we will create an Amazon Virtual Private Cloud (VPC) to house the ECS cluster and a public-facing, Layer 7 Application Load Balancer (ALB) to load-balance our ECS-based application.

Creating ECR Repositories

In the first AWS account, representing the vendor, we will execute two CloudFormation templates. The first template, development-user-group-isv.yml, creates the Development group and VendorDev user. The VendorDev user will be given explicit access to the vendor’s rake-app ECR repository. Change the DevUserPassword parameter’s value to something more secure.

# change me
export ISV_ACCOUNT=111222333444
export ISV_ECR_REGION=us-east-2
export IAM_USER_PSWD=T0pS3cr3Tpa55w0rD
aws --region ${ISV_ECR_REGION} cloudformation create-stack \
--stack-name development-user-group-isv \
--template-body file://cfn-templates/development-user-group-isv.yml \
--parameters \
ParameterKey=DevUserPassword,ParameterValue=${IAM_USER_PSWD} \
--capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the resulting CloudFormation Stack showing the new IAM Group and User.

Next, we will execute the second CloudFormation template, ecr-repo-shared.yml, which creates the vendor’s rake-app ECR image repository. The rake-app repository will house a copy of the vendor’s rake-app Docker Image. But first, let’s look at the CloudFormation template used to create the repository, specifically the RepositoryPolicyText section. Here we define two repository policies:

  • The AllowPushPull policy explicitly allows the VendorDev user to push and pull versions of the image to the ECR repository. We import the exported Amazon Resource Name (ARN) of the VendorDev user from the previous CloudFormation Stack Outputs. We have also allowed AWS CodeBuild service access to the ECR repository. This is known as a Service-Linked Role. We will not use CodeBuild in this brief post.
  • The AllowPull policy allows anyone in the customer’s AWS account (root) to pull any version of the image. They cannot push, only pull. Cross-account access can be restricted to a finer-grained set of the specific customer’s IAM Entities and source IP addresses.

Note the "ecr:GetAuthorizationToken" policy Action. This action will allow the customer’s user to log into the vendor’s ECR repository and receive an Authorization Token. The customer retrieves a token that is valid for a specified container registry for 12 hours.

RepositoryPolicyText:
Version: '2012-10-17'
Statement:
- Sid: AllowPushPull
Effect: Allow
Principal:
Service: codebuild.amazonaws.com
AWS:
Fn::ImportValue:
!Join [':', [!Ref 'StackName', 'DevUserArn']]
Action:
- 'ecr:BatchCheckLayerAvailability'
- 'ecr:BatchGetImage'
- 'ecr:CompleteLayerUpload'
- 'ecr:DescribeImages'
- 'ecr:DescribeRepositories'
- 'ecr:GetDownloadUrlForLayer'
- 'ecr:GetRepositoryPolicy'
- 'ecr:InitiateLayerUpload'
- 'ecr:ListImages'
- 'ecr:PutImage'
- 'ecr:UploadLayerPart'
- Sid: AllowPull
Effect: Allow
Principal:
AWS: !Join [':', ['arn:aws:iam:', !Ref 'CustomerAccount', 'root']]
Action:
- 'ecr:GetAuthorizationToken'
- 'ecr:BatchCheckLayerAvailability'
- 'ecr:GetDownloadUrlForLayer'
- 'ecr:BatchGetImage'
- 'ecr:DescribeRepositories' # optional permission
- 'ecr:DescribeImages' # optional permission

Before executing the following command to deploy the CloudFormation Stack, ecr-repo-shared.yml, replace the CUSTOMER_ACCOUNT value with your pseudo customer’s AWS account ID.

# change me
export CUSTOMER_ACCOUNT=999888777666
export CUSTOMER_ECR_REGION=us-west-2
# NLP Rake Microservice
REPO_NAME=rake-app
aws --region ${ISV_ECR_REGION} cloudformation create-stack \
--stack-name ecr-repo-${REPO_NAME} \
--template-body file://cfn-templates/ecr-repo-shared.yml \
--parameters \
ParameterKey=CustomerAccount,ParameterValue=${CUSTOMER_ACCOUNT} \
ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
--capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the resulting CloudFormation Stack showing the new ECR repository.

Below, we see the ECR repository policies applied correctly in the Permissions tab of the rake-app repository. The first policy covers both the VendorDev user, referred to as an IAM Entity, as well as AWS CodeBuild, referred to as a Service Principal.

The second policy covers the customer’s AWS account ID.

Repeat this process in the customer’s AWS account. First, the CloudFormation template, development-user-group-customer.yml, containing the Development group and CustomerDev user.

# change me
export IAM_USER_PSWD=T0pS3cr3Tpa55w0rD
aws --region ${CUSTOMER_ECR_REGION} cloudformation create-stack \
--stack-name development-user-group-customer \
--template-body file://cfn-templates/development-user-group-customer.yml \
--parameters \
ParameterKey=DevUserPassword,ParameterValue=${IAM_USER_PSWD} \
--capabilities CAPABILITY_NAMED_IAM

Next, we will execute the second CloudFormation template, ecr-repo-not-shared.yml, three times, once for each of the customer’s three ECR repositories, nlp-client, lang-app, and prose-app. Note that in the RepositoryPolicyText section of the template we only define a single policy. Identical to the vendor’s policy, the AllowPushPull policy explicitly allows the previously-created CustomerDev user to push and pull versions of the image to the ECR repository. There is no cross-account access required to the customer’s two ECR repositories.

RepositoryPolicyText:
Version: '2012-10-17'
Statement:
- Sid: AllowPushPull
Effect: Allow
Principal:
Service: codebuild.amazonaws.com
AWS:
Fn::ImportValue:
!Join [':', [!Ref 'StackName', 'DevUserArn']]

Action:
- 'ecr:BatchCheckLayerAvailability'
- 'ecr:BatchGetImage'
- 'ecr:CompleteLayerUpload'
- 'ecr:DescribeImages'
- 'ecr:DescribeRepositories'
- 'ecr:GetDownloadUrlForLayer'
- 'ecr:GetRepositoryPolicy'
- 'ecr:InitiateLayerUpload'
- 'ecr:ListImages'
- 'ecr:PutImage'
- 'ecr:UploadLayerPart'

Execute the following commands to create the three CloudFormation Stacks. The Stacks use the same template, ecr-repo-not-shared.yml, with a different Stack name and RepoName parameter values.

# NLP Client microservice
REPO_NAME=nlp-client
aws --region ${CUSTOMER_ECR_REGION} cloudformation create-stack \
--stack-name ecr-repo-${REPO_NAME} \
--template-body file://cfn-templates/ecr-repo-not-shared.yml \
--parameters \
ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
--capabilities CAPABILITY_NAMED_IAM

# NLP Prose microservice
REPO_NAME=prose-app
aws --region ${CUSTOMER_ECR_REGION} cloudformation create-stack \
--stack-name ecr-repo-${REPO_NAME} \
--template-body file://cfn-templates/ecr-repo-not-shared.yml \
--parameters \
ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
--capabilities CAPABILITY_NAMED_IAM
# NLP Language microservice
REPO_NAME=lang-app
aws --region ${CUSTOMER_ECR_REGION} cloudformation create-stack \
--stack-name ecr-repo-${REPO_NAME} \
--template-body file://cfn-templates/ecr-repo-not-shared.yml \
--parameters \
ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
--capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the resulting three ECR repositories.

At this point, we have our four ECR repositories across the two AWS accounts, with the proper ECR Repository Policies applied to each.

Building and Pushing Images to ECR

Next, we will build and push the three NLP application images to their corresponding ECR repositories. To confirm that the ECR policies are working correctly, log in as the VendorDev user and perform the below command.

aws ecr get-login-password --region ${ISV_ECR_REGION} \
| docker login --username AWS --password-stdin ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com

Logged in as the vendor’s VendorDev user, build and push the Docker image to the rake-app repository. The Dockerfile and Golang source code are located in each GitHub repository. With Golang and Docker multi-stage builds, we will create very small Docker images, based on Scratch, containing just the compiled Go executable binary. At a mere 7–15 MBs in size, pushing and pulling these Docker images across accounts is very fast.

docker build -t ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0 . --no-cache
docker push ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0

Below, we see the output from the vendor’s VendorDev user logging into the rake-app repository.

We see the vendor’s VendorDev user building results and pushing the Docker image to the rake-app repository.

Next, after logging in as the customer’s CustomerDev user, build and push the Docker images to the ECR nlp-client, lang-app, and prose-app repositories. Again, make sure you substitute the variable values below with your pseudo customer’s AWS account and preferred AWS region.

aws ecr get-login-password --region ${CUSTOMER_ECR_REGION} \
| docker login --username AWS --password-stdin ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com
# nlp-client
docker build -t ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/nlp-client:1.1.0 . --no-cache
docker push ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/nlp-client:1.1.0
# prose-app
docker build -t ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/prose-app:1.1.0 . --no-cache
docker push ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/prose-app:1.1.0
# lang-app
docker build -t ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/lang-app:1.1.0 . --no-cache
docker push ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/lang-app:1.1.0

At this point, each of the four customer ECR repositories has a Docker image pushed to them.

Deploying Locally to Docker Swarm

As a simple demonstration of cross-account ECS access, we will start with Docker Swarm. Logged in as the customer’s CustomerDev user and using the Docker Swarm Stack file included in the project, we can create and run a local copy of our NLP application in our customer’s account. First, we need to log into the vendor’s ECR repository in order to pull the image from the vendor’s ECR registry.

aws ecr get-login-password --region ${ISV_ECR_REGION} \
| docker login --username AWS --password-stdin ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com

Once logged in to the vendor’s ECR repository, we will pull the image. Using the docker describe-repositories and docker describe-images, we can list cross-account repositories and images your IAM user has access to if you are unsure.

aws ecr describe-repositories \
--registry-id ${ISV_ACCOUNT} \
--region ${ISV_ECR_REGION} \
--repository-name rake-app
aws ecr describe-images \
--registry-id ${ISV_ACCOUNT} \
--region ${ISV_ECR_REGION} \
--repository-name rake-app
docker pull ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0

Using the following command, you should see each of our four applications Docker images.

docker image ls --filter=reference='*amazonaws.com/*'

Below, we see an example of the expected terminal output from pulling the image and listing the images.

Build Docker Stack Locally

Next, build the Docker Swarm Stack. The Docker Compose file, stack.yml, is shown below. Note the location of the Docker images.

version: '3.9'
services:
nlp-client:
image: ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/nlp-client:1.1.0
networks:
nlp-demo
ports:
target: 8080
published: 8080
protocol: tcp
mode: host
environment:
NLP_CLIENT_PORT
RAKE_ENDPOINT
PROSE_ENDPOINT
LANG_ENDPOINT
API_KEY
rake-app:
image: ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0
networks:
nlp-demo
environment:
RAKE_PORT
API_KEY
prose-app:
image: ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/prose-app:1.1.0
networks:
nlp-demo
environment:
PROSE_PORT
API_KEY
lang-app:
image: ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/lang-app:1.1.0
networks:
nlp-demo
environment:
LANG_PORT
API_KEY
networks:
nlp-demo:
volumes:
data: {}
view raw stack.yaml hosted with ❤ by GitHub

Execute the following commands to deploy the Docker Stack to Docker Swarm. Again, make sure you substitute the variable values below with your pseudo vendor and customer AWS accounts and regions. Additionally, the NLP API uses an API Key to protect all exposed endpoints, except the /health endpoint, across all four services. Change the default CloudFormation template’s API Key parameter to something more secure.

# change me
export ISV_ACCOUNT=111222333444
export ISV_ECR_REGION=us-east-2
export CUSTOMER_ACCOUNT=999888777666
export CUSTOMER_ECR_REGION=us-west-2
export API_KEY=SuP3r5eCRetAutHK3y

# don't change me
export NLP_CLIENT_PORT=8080
export RAKE_PORT=8080
export PROSE_PORT=8080
export LANG_PORT=8080
export RAKE_ENDPOINT=http://rake-app:${RAKE_PORT}
export PROSE_ENDPOINT=http://prose-app:${PROSE_PORT}
export LANG_ENDPOINT=http://lang-app:${LANG_PORT}
export TEXT="The Nobel Prize is regarded as the most prestigious award in the World. Notable winners have included Marie Curie, Theodore Roosevelt, Albert Einstein, George Bernard Shaw, and Winston Churchill."
 
docker swarm init 
docker stack deploy --compose-file docker/stack.yml nlp

You can check the success of the deployment with either of the following commands:

docker stack ps nlp --no-trunc
docker container ls

Below, we see an example of the expected terminal output.

With the Docker Stack, you can hit the nlp-client service directly on localhost:8080. Unlike Fargate, which requires unique static ports for each container in the task, with Docker, we can choose to run all the containers on the same port without conflict since only the nlp-client service is exposing port :8080. Unlike with ECS, there is no load balancer in front of the Stack, since we only have a single node in our Swarm and thus a single container instance of each microservice for testing.

To test that the images were pulled successfully and the Docker Stack is running, we can execute a curl command against any of the API endpoints, such as /keywords. Below, I am using jq to pretty-print the JSON response payload.

curl -s -X POST \
"http://localhost:${NLP_CLIENT_PORT}/keywords" \
-H 'Content-Type: application/json' \
-H "X-API-Key: ${API_KEY}" \
-d '{"text": "The Internet is the global system of interconnected computer networks that use the Internet protocol suite to link devices worldwide."}' | jq

The resulting JSON response payload indicates that the nlp-client service was reached successfully and that it was then subsequently able to communicate with the rake-app service, whose container image originated from the vendor’s ECR repository.

[
{
"candidate": "interconnected computer networks",
"score": 9
},
{
"candidate": "link devices worldwide",
"score": 9
},
{
"candidate": "internet protocol suite",
"score": 8
},
{
"candidate": "global system",
"score": 4
},
{
"candidate": "internet",
"score": 2
}
]

Creating Amazon ECS Environment

Although using Docker Swarm locally is a great way to understand how cross-account ECR access works, it is not a typical use case for deploying containerized applications on the AWS Platform. More often, you could use Amazon ECS, Amazon Elastic Kubernetes Service (EKS), or enterprise versions of third-party orchestrators such as RedHat OpenShift or Rancher.

Using CloudFormation and some very convenient CloudFormation templates supplied by Amazon as a starting point, we will create a complete ECS environment for our application. First, we will create a VPC to house the ECS cluster and a public-facing ALB to front our ECS-based application, using the public-vpc.yml template.

aws --region ${CUSTOMER_ECR_REGION} cloudformation create-stack \
--stack-name public-vpc \
--template-body file://cfn-templates/public-vpc.yml \
--capabilities CAPABILITY_NAMED_IAM

Next, we will create the ECS cluster and an Amazon ECS Task Definition using the public-subnet-public-loadbalancer.yml template. Again, the Task Definition defines how ECS will deploy our application using AWS Fargate. Amazon Fargate allows you to run containers without having to manage servers or clusters. No EC2 instances to manage! Woot! Below, in the CloudFormation template, we see the ContainerDefinitions section of the TaskDefinition resource that contains three container definitions. Note the three images and their ECR locations.

TaskDefinition:
Type: AWS::ECS::TaskDefinition
DependsOn: CloudWatchLogsGroup
Properties:
Family: !Ref 'ServiceNameClient'
Cpu: !Ref 'ContainerCpu'
Memory: !Ref 'ContainerMemory'
NetworkMode: awsvpc
RequiresCompatibilities:
FARGATE
EC2
ExecutionRoleArn:
Fn::ImportValue:
!Join [':', [!Ref 'StackName', 'ECSTaskExecutionRole']]
TaskRoleArn:
Fn::If:
'HasCustomRole'
!Ref 'Role'
!Ref 'AWS::NoValue'
ContainerDefinitions:
Name: nlp-client
Cpu: 256
Memory: 1024
Image: !Join ['.', [!Ref 'AWS::AccountId', 'dkr.ecr', !Ref 'AWS::Region', 'amazonaws.com/nlp-client:1.1.0']]
PortMappings:
ContainerPort: !Ref ContainerPortClient
Essential: true
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-region: !Ref AWS::Region
awslogs-group: !Ref CloudWatchLogsGroup
awslogs-stream-prefix: ecs
Environment:
Name: NLP_CLIENT_PORT
Value: !Ref ContainerPortClient
Name: RAKE_ENDPOINT
Value: !Join [':', ['http://localhost', !Ref ContainerPortRake]]
Name: PROSE_ENDPOINT
Value: !Join [':', ['http://localhost', !Ref ContainerPortProse]]
Name: LANG_ENDPOINT
Value: !Join [':', ['http://localhost', !Ref ContainerPortLang]]
Name: API_KEY
Value: !Ref ApiKey
Name: rake-app
Cpu: 256
Memory: 1024
Image: !Join ['.', [!Ref VendorAccountId, 'dkr.ecr', !Ref VendorEcrRegion, 'amazonaws.com/rake-app:1.1.0']]
Essential: true
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-region: !Ref AWS::Region
awslogs-group: !Ref CloudWatchLogsGroup
awslogs-stream-prefix: ecs
Environment:
Name: RAKE_PORT
Value: !Ref ContainerPortRake
Name: API_KEY
Value: !Ref ApiKey
Name: prose-app
Cpu: 256
Memory: 1024
Image: !Join ['.', [!Ref 'AWS::AccountId', 'dkr.ecr', !Ref 'AWS::Region', 'amazonaws.com/prose-app:1.1.0']]
Essential: true
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-region: !Ref AWS::Region
awslogs-group: !Ref CloudWatchLogsGroup
awslogs-stream-prefix: ecs
Environment:
Name: PROSE_PORT
Value: !Ref ContainerPortProse
Name: API_KEY
Value: !Ref ApiKey
Name: lang-app
Cpu: 256
Memory: 1024
Image: !Join ['.', [!Ref 'AWS::AccountId', 'dkr.ecr', !Ref 'AWS::Region', 'amazonaws.com/lang-app:1.1.0']]
Essential: true
LogConfiguration:
LogDriver: awslogs
Options:
awslogs-region: !Ref AWS::Region
awslogs-group: !Ref CloudWatchLogsGroup
awslogs-stream-prefix: ecs
Environment:
Name: LANG_PORT
Value: !Ref ContainerPortLang
Name: API_KEY
Value: !Ref ApiKey

Execute the following command to create the ECS cluster and an ECS Task Definition using the CloudFormation template.

# change me
export ISV_ACCOUNT=111222333444
export ISV_ECR_REGION=us-east-2
export API_KEY=SuP3r5eCRetAutHK3y
aws cloudformation create-stack \
--stack-name public-subnet-public-loadbalancer \
--template-body file://cfn-templates/public-subnet-public-loadbalancer.yml \
--parameters \
ParameterKey=VendorAccountId,ParameterValue=${ISV_ACCOUNT} \
ParameterKey=VendorEcrRegion,ParameterValue=${ISV_ECR_REGION} \
ParameterKey=ApiKey,ParameterValue=${API_KEY} \
--capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the expected output from the CloudFormation Management Console.

If you want to update the ECS Task Definition, simply run the aws cloudformation update-stack command.

CloudWatch Container Insights

The CloudFormation template does not enable CloudWatch Container Insights by default. Container Insights collects, aggregates, and summarizes metrics and logs from your containerized applications. To enable Insights, execute the following command:

aws ecs put-account-setting \
--name "containerInsights" --value "enabled"

Confirming the Cross-account Policy

If everything went right in the previous steps, we should now have an ECS cluster running our containerized application, including the container built from the vendor’s Docker image. Below, we see an example of the ECS cluster displayed in the management console.

Within the ECR cluster, we should observe a single running ECS Service. According to AWS, Amazon ECS allows you to run and maintain a specified number of instances of a task definition simultaneously in an Amazon ECS cluster; this is referred to as a Service.

We are running two instances of each container on ECS, thus two copies of the task within a single service. Each task runs its containers in a different Availability Zone for high availability.

Drilling into the service, note the new ALB associated with the new VPC, two public subnets, and the corresponding security group.

Switching to the Task Definitions tab, note that the ECS Task contains four container definitions that comprise the NLP API. Three images originated from the customer’s ECR repositories, and one from the vendor’s ECR repository.

Drilling in further, we will see the details of each container definition, including environment variables, passed from ECR to the container and on to the Golang binary running in the container.

Accessing the NLP API on ECS

With our earlier Docker Swarm example, the curl command was issued against localhost. We now have a public-facing Application Load Balancer (ALB) in front of our ECS-based application. We will use the DNS name of the ALB to hit our application on ECS. The DNS address (A Record) can be obtained from the Load Balancer Management Console, as shown below, or from the Output tab of the public-vpc CloudFormation Stack.

Another difference between the earlier Docker Swarm example and ECS is the port. Although the edge service, nlp-client, runs on port :8080, the ALB acts as a reverse proxy, passing requests from port :80 on the ALB to port :8080 of the nlp-client container instances (actually, the shared ENI of the running task).

Although I did set up a custom domain name for the ALB using Route53 and enabled HTTPS (port 443 on the ELB), https://nlp-ecs.example-api.com, for the sake of brevity, I will not go into the details in this post.

To test our deployed ECS, we can use a tool like curl or Postman to test the API’s endpoints. Don’t forget to you will need to add the API Key for authentication using the X-API-Key header key/value pair. Below we see a successful GET against the /routes endpoint, using Postman.

Here we see a successful POST against the /keywords endpoint, using Postman.

Cleaning Up

To clean up the demonstration’s AWS resources and Docker Stack, run the following scripts in the appropriate AWS accounts. Importantly, similar to S3, you must delete all the Docker images in the ECR repositories first, before deleting the repository, or else you will receive a CloudFormation error. This includes untagged images.

# local docker stack
docker stack rm nlp
# customer account
aws ecr batch-delete-image \
--repository-name nlp-client \
--image-ids imageTag=1.1.0
aws ecr batch-delete-image \
--repository-name prose-app \
--image-ids imageTag=1.1.0
aws ecr batch-delete-image \
--repository-name lang-app \
--image-ids imageTag=1.1.0
aws cloudformation delete-stack \
--stack-name ecr-repo-nlp-client
aws cloudformation delete-stack \
--stack-name ecr-repo-prose-app
aws cloudformation delete-stack \
--stack-name ecr-repo-lang-app
aws cloudformation delete-stack \
--stack-name public-subnet-public-loadbalancer
aws cloudformation delete-stack \
--stack-name public-vpc
aws cloudformation delete-stack \
--stack-name development-user-group-customer
# vendor account
aws ecr batch-delete-image \
--repository-name rake-app \
--image-ids imageTag=1.1.0
aws cloudformation delete-stack \
--stack-name ecr-repo-rake-app
aws cloudformation delete-stack \
--stack-name development-user-group-isv

Conclusion

In the preceding post, we saw how multiple AWS accounts could share private ECR-based Docker images. There are variations and restrictions to the configuration of the ECR Repository Policies, depending on the deployment tools you are using, such as AWS CodeBuild, AWS CodeDeploy, or AWS Elastic Beanstalk. AWS does a good job of providing some examples in their documentation, including Amazon ECR Repository Policy Examples and Amazon Elastic Container Registry Identity-Based Policy Examples.

In late 2020, AWS released Amazon Elastic Container Registry Public (ECR Public). Although this post was about private images, for public images, ECR Public allows you to store, manage, share, and deploy container images for anyone to discover and download globally.


This blog represents my own viewpoints and not of my employer, Amazon Web Services (AWS). All product names, logos, and brands are the property of their respective owners.

, , , ,

Leave a comment

Amazon ECR Cross-Account Access for Containerized Applications on ECS

There is an updated April 2021 version of this post, which uses AWS CLI version 2 commands for ECR and updated versions of the Docker images. Please refer to this newer post.

Recently, I was asked a question regarding sharing Docker images from one AWS Account’s Amazon Elastic Container Registry (ECR) with another AWS Account who was deploying to Amazon Elastic Container Service (ECS) with AWS Fargate. The answer was relatively straightforward, use ECR Repository Policies to allow cross-account access to pull images. However, the devil is always in the implementation details. Constructing ECR Repository Policies can depend on your particular architecture, choice deployment tools, and method of account access. In this brief post, we will explore a common architectural scenario that requires configuring ECR Repository Policies to support sharing images across AWS Accounts.

Sharing Images

There are two scenarios I frequently encounter, which require sharing ECR-based Docker images across multiple AWS Accounts. In the first scenario, a vendor wants to securely share a Docker Image with their customer. Many popular container security and observability solutions function in this manner.

Below, we see an example where an application platform consists of three containers. Two of the container’s images originated from the customer’s own ECR repositories (right side). The third container’s image originated from their vendor’s ECR registry (left side).

ecs-example-1.png

In the second scenario, an enterprise operates multiple AWS Accounts to create logical security boundaries between environments and responsibilities. The first AWS Account contains the enterprise’s deployable binary assets, including ECR image repositories. The enterprise has additional accounts, one for each application environment, such as Dev, Test, Staging, and Production. The ECR images in the repository account need to be accessed from each of the environment accounts, often across multiple AWS Regions.

Below, we see an example where the deployed application platform consists of three containers, of which all images originated from the ECR repositories (left side). The images are pulled into the Production account for deployment to ECS (right side).

ecs-example-2

Demonstration

In this post, we will explore the first scenario, a vendor wants to securely share a Docker Image with their customer. We will demonstrate how to share images across AWS Accounts for use with Docker Swarm and ECS with Fargate, using ECR Repository Policies. To accomplish this scenario, we will use an existing application I have created, a RESTful, HTTP-based NLP (Natural Language Processing) API, consisting of three Golang microservices. The edge service, nlp-client, communicates with the rake-app service and the prose-app service.

ecs-example-3

The scenario in the demonstration is that the customer has developed the nlp-client and prose-app container-based services, as part of their NLP application. Instead of developing their own implementation of the RAKE (Rapid Automatic Keyword Extraction) algorithm, they have licensed a version from a vendor, the rake-app service, in the form of a Docker Image.

The NPL API exposes several endpoints, accessible through the nlp-client service. The endpoints perform common NLP operations on text, such as extracting keywords, tokens, and entities. All the endpoints are visible by hitting the /routes endpoint.

[
  {
    "method": "POST",
    "path": "/tokens",
    "name": "main.getTokens"
  },
  {
    "method": "POST",
    "path": "/entities",
    "name": "main.getEntities"
  },
  {
    "method": "GET",
    "path": "/health",
    "name": "main.getHealth"
  },
  {
    "method": "GET",
    "path": "/routes",
    "name": "main.getRoutes"
  },
  {
    "method": "POST",
    "path": "/keywords",
    "name": "main.getKeywords"
  }
]

Requirements

To follow along with the demonstration, you will need two AWS Accounts, one representing the vendor and one representing one of their customers. It’s relatively simple to create additional AWS Accounts, all you need is a unique email address (easy with Gmail) and a credit card. Using AWS Organizations can make the task of creating and managing multiple accounts even easier.

I have purposefully used different AWS Regions within each account to demonstrate how you can share ECR images across both AWS Accounts and Regions. You will need a recent version of the AWS CLI and Docker. Lastly, you will need sufficient access to each AWS Account to create resources.

Source Code

The demonstration’s source code is contained in four public GitHub repositories. The first repository contains all the CloudFormation templates and the Docker Compose Stack file, as shown below.

.
├── LICENSE
├── README.md
├── cfn-templates
│   ├── developer-user-group.yml
│   ├── ecr-repo-not-shared.yml
│   ├── ecr-repo-shared.yml
│   ├── public-subnet-public-loadbalancer.yml
│   └── public-vpc.yml
└── docker
    └── stack.yml

Each of the other three GitHub repositories contains a single Go-based microservice, which together comprises the NLP application. Each respository also contains a Dockerfile.

.
├── Dockerfile
├── LICENSE
├── README.md
├── buildspec.yml
└── main.go

The commands required to clone the four repositories are as follows.

git clone --branch master \
    --single-branch --depth 1 --no-tags \
    https://github.com/garystafford/ecr-cross-account-demo.git 

git clone --branch master \
    --single-branch --depth 1 --no-tags \
    https://github.com/garystafford/nlp-client.git

git clone --branch master \
    --single-branch --depth 1 --no-tags \
    https://github.com/garystafford/prose-app.git

git clone --branch master \
    --single-branch --depth 1 --no-tags \
    https://github.com/garystafford/rake-app.git

Process Overview

We will use AWS CloudFormation to create the necessary resources within both AWS Accounts. For the customer account, we will also use CloudFormation to create an ECS Cluster and an Amazon ECS Task Definition. The Task Definition defines how ECS will deploy our application, consisting of three Docker containers, using AWS Fargate. In addition to ECS, we will create an Amazon Virtual Private Cloud (VPC) to house the ECS cluster and a public-facing, Layer 7 Application Load Balancer (ALB) to load-balance our ECS-based application.

Throughout the post, I will use AWS Cloud9, the cloud-based integrated development environment (IDE), to execute all CloudFormation templates using the AWS CLI. I will also use Cloud9 to build and push the Docker images to the ECR repositories. Personally, I find Cloud9 easier to switch between multiple AWS Accounts and AWS Identity and Access Management (IAM) Users, using separate instances of Cloud9, verses using my local workstation. Conveniently, Cloud9 comes preinstalled with many of the tools you will need for this demonstration.

Creating ECR Repositories

In the first AWS Account, representing the vendor, we will execute two CloudFormation templates. The first template, developer-user-group.yml, creates the Development IAM Group and User. The Developer-01 IAM User will be given explicit access to the vendor’s rake-app ECR repository. I suggest you change the DevUserPassword parameter’s value to something more secure.

# change me
IAM_USER_PSWD=T0pS3cr3Tpa55w0rD 

aws cloudformation create-stack \
    --stack-name developer-user-group.yml \
    --template-body file://cfn-templates/developer-user-group.yml \
    --parameters \
        ParameterKey=DevUserPassword,ParameterValue=${IAM_USER_PSWD} \
    --capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the resulting CloudFormation Stack showing the new Development IAM User and Group.

screen_shot_2019-10-27_at_7_05_25_pm

Next, we will execute the second CloudFormation template, ecr-repo-shared.yml, which creates the vendor’s rake-app ECR image repository. The rake-app repository will house a copy of the vendor’s rake-app Docker Image. But first, let’s look at the CloudFormation template used to create the repository, specifically the RepositoryPolicyText section. Here we define two repository policies:

  • The AllowPushPull policy explicitly allows the Developer-01 IAM User to push and pull versions of the image to the ECR repository.  We import the exported Amazon Resource Name (ARN) of the Developer-01 IAM User from the previous CloudFormation Stack Outputs. We have also allowed the AWS CodeBuild service access to the ECR repository. This is known as a Service-Linked Role. We will not use CodeBuild in this brief post.
  • The AllowPull policy allows anyone in the customer’s AWS Account (root) to pull any version of the image. They cannot push, only pull. Of course, cross-account access can be restricted to a finer-grained set of the specific customer’s IAM Entities and source IP addresses.

Note the "ecr:GetAuthorizationToken" policy Action. Later, when the customer needs to pull this vendor’s image, this Action will allow the customer’s User to log into the vendor’s ECR repository and receive an Authorization Token. The customer retrieves a token that is valid for a specified container registry for 12 hours.

RepositoryPolicyText:
  Version: '2012-10-17'
  Statement:
    - Sid: AllowPushPull
      Effect: Allow
      Principal:
        Service: codebuild.amazonaws.com
        AWS:
          Fn::ImportValue:
            !Join [':', [!Ref 'StackName', 'DevUserArn']]
      Action:
        - 'ecr:BatchCheckLayerAvailability'
        - 'ecr:BatchGetImage'
        - 'ecr:CompleteLayerUpload'
        - 'ecr:DescribeImages'
        - 'ecr:DescribeRepositories'
        - 'ecr:GetDownloadUrlForLayer'
        - 'ecr:GetRepositoryPolicy'
        - 'ecr:InitiateLayerUpload'
        - 'ecr:ListImages'
        - 'ecr:PutImage'
        - 'ecr:UploadLayerPart'
    - Sid: AllowPull
      Effect: Allow
      Principal:
        AWS: !Join [':', ['arn:aws:iam:', !Ref 'CustomerAccount', 'root']]
      Action:
        - 'ecr:GetAuthorizationToken'
        - 'ecr:BatchCheckLayerAvailability'
        - 'ecr:GetDownloadUrlForLayer'
        - 'ecr:BatchGetImage'
        - 'ecr:DescribeRepositories' # optional permission
        - 'ecr:DescribeImages' # optional permission

Before executing the following command to deploy the CloudFormation Stack, ecr-repo-shared.yml, replace the CustomerAccount value, shown below, with your pseudo customer’s AWS Account ID.

# change me
CUSTOMER_ACCOUNT=999888777666

# don't change me
REPO_NAME=rake-app
 
aws cloudformation create-stack \
    --stack-name ecr-repo-${REPO_NAME} \
    --template-body file://cfn-templates/ecr-repo-shared.yml \
    --parameters \
        ParameterKey=CustomerAccount,ParameterValue=${CUSTOMER_ACCOUNT} \
        ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
    --capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the resulting CloudFormation Stack showing the new ECR repository.

screen_shot_2019-10-27_at_7_10_12_pm

Below, we see the ECR repository policies applied correctly in the Permissions tab of the rake-app repository. The first policy covers both the Developer-01 IAM User, referred to as an IAM Entity, as well as AWS CodeBuild, referred to as a Service Principal.

screen_shot_2019-10-27_at_8_59_52_pm

The second policy covers the customer’s AWS Account ID.

screen_shot_2019-10-27_at_9_00_09_pm

Repeat this process in the customer’s AWS Account. First, the CloudFormation template, developer-user-group.yml, containing Development IAM Group and Developer-01 User.

# change me
IAM_USER_PSWD=T0pS3cr3Tpa55w0rD 

aws cloudformation create-stack \
    --stack-name developer-user-group.yml \
    --template-body file://cfn-templates/developer-user-group.yml \
    --parameters \
        ParameterKey=DevUserPassword,ParameterValue=${IAM_USER_PSWD} \
    --capabilities CAPABILITY_NAMED_IAM

Next, we will execute the second CloudFormation template, ecr-repo-not-shared.yml, twice, once for each of the customer’s two ECR repositories, nlp-client and prose-app. First, let’s look at the template, specifically the RepositoryPolicyText section. In this CloudFormation template, we only define a single policy. Identical to the vendor’s policy, the AllowPushPull policy explicitly allows the previously-created Developer-01 IAM User to push and pull versions of the image to the ECR repository. There is no cross-account access required to the customer’s two ECR repositories.

RepositoryPolicyText:
  Version: '2012-10-17'
  Statement:
    - Sid: AllowPushPull
      Effect: Allow
      Principal:
        Service: codebuild.amazonaws.com
        AWS:
          Fn::ImportValue:
            !Join [':', [!Ref 'StackName', 'DevUserArn']]
      Action:
        - 'ecr:BatchCheckLayerAvailability'
        - 'ecr:BatchGetImage'
        - 'ecr:CompleteLayerUpload'
        - 'ecr:DescribeImages'
        - 'ecr:DescribeRepositories'
        - 'ecr:GetDownloadUrlForLayer'
        - 'ecr:GetRepositoryPolicy'
        - 'ecr:InitiateLayerUpload'
        - 'ecr:ListImages'
        - 'ecr:PutImage'
        - 'ecr:UploadLayerPart'

Execute the following commands to create the two CloudFormation Stacks. The Stacks use the same template with a different Stack name and RepoName parameter values.

# nlp-client
REPO_NAME=nlp-client
aws cloudformation create-stack \
    --stack-name ecr-repo-${REPO_NAME} \
    --template-body file://cfn-templates/ecr-repo-not-shared.yml \
    --parameters \
        ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
    --capabilities CAPABILITY_NAMED_IAM
    
# prose-app
REPO_NAME=prose-app 
aws cloudformation create-stack \
    --stack-name ecr-repo-${REPO_NAME} \
    --template-body file://cfn-templates/ecr-repo-not-shared.yml \
    --parameters \
        ParameterKey=RepoName,ParameterValue=${REPO_NAME} \
    --capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the resulting two ECR repositories.

screen_shot_2019-10-27_at_9_35_49_pm

At this point, we have our three ECR repositories across the two AWS Accounts, with the proper ECR Repository Policies applied to each.

Building and Pushing Images to ECR

Next, we will build and push the three NLP application images to their corresponding ECR repositories. To confirm the ECR policies are working correctly, log in as the Developer-01 IAM User to perform the following actions.

Logged in as the vendor’s Developer-01 IAM User, build and push the Docker image to the rake-app repository. The Dockerfile and Go source code is located in each GitHub repository. With Go and Docker multi-stage builds, we will make super small Docker images, based on Scratch, with just the compiled Go executable binary. At less than 10–20 MBs in size, pushing and pulling these Docker images, even across accounts, is very fast. Make sure you substitute the variable values below with your pseudo vendor’s AWS Account and Region. I am using the acroymn, ISV (Independent Software Vendor) for the vendor.

# change me
ISV_ACCOUNT=111222333444
ISV_ECR_REGION=us-east-2

$(aws ecr get-login --no-include-email --region ${ISV_ECR_REGION})
docker build -t ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0 . --no-cache
docker push ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0

Below, we see the output from the vendor’s Developer-01 IAM User logging into the rake-app repository.

screen_shot_2019-10-27_at_8_55_06_pm

Then, we see the results of the vendor’s Development IAM User building and pushing the Docker Image to the rake-app repository.

screen_shot_2019-10-27_at_8_56_23_pm

Next, logged in as the customer’s Developer-01 IAM User, build and push the Docker images to the ECR nlp-client and prose-app repositories. Again, make sure you substitute the variable values below with your pseudo customer’s AWS Account and preferred Region.

# change me
CUSTOMER_ACCOUNT=999888777666
CUSTOMER_ECR_REGION=us-west-2

$(aws ecr get-login --no-include-email --region ${CUSTOMER_ECR_REGION})
docker build -t ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/nlp-client:1.1.0 . --no-cache
docker push ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/nlp-client:1.1.0

docker build -t ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/prose-app:1.1.0 . --no-cache
docker push ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/prose-app:1.1.0

At this point, each of the three ECR repositories has a Docker Image pushed to them.

Deploying Locally to Docker Swarm

As a simple demonstration of cross-account ECS access, we will start with Docker Swarm. Logged in as the customer’s Developer-01 IAM User and using the Docker Swarm Stack file included in the project, we can create and run a local copy of our NLP application in our customer’s account. First, we need to log into the vendor’s ECR repository in order to pull the image from the vendor’s ECR registry.

# change me
ISV_ACCOUNT=111222333444
ISV_ECR_REGION=us-east-2

aws ecr get-login \
    --registry-ids ${ISV_ACCOUNT} \
    --region ${ISV_ECR_REGION} \
    --no-include-email

The aws ecr get-login command simplifies the login process by returning a (very lengthy) docker login command in response (shown abridged below). According to AWS, the authorizationToken returned for each registry specified is a base64 encoded string that can be decoded and used in a docker login command to authenticate to an ECR registry.

docker login -u AWS -p eyJwYXlsb2FkI...joidENXMWg1WW0 \
    https://111222333444.dkr.ecr.us-east-2.amazonaws.com

Copy, paste and execute the entire docker login command back into your terminal. Below, we see an example of the expected terminal output from logging into the vendor’s ECR repository.

screen_shot_2019-10-28_at_7_05_03_am

Once successfully logged in to the vendor’s ECR repository, we will pull the image. Using the docker describe-repositories  and docker describe-images, we can list cross-account repositories and images your IAM User has access to if you are unsure.

aws ecr describe-repositories \
    --registry-id ${ISV_ACCOUNT} \
    --region ${ISV_ECR_REGION} \
    --repository-name rake-app

aws ecr describe-images \
    --registry-id ${ISV_ACCOUNT} \
    --region ${ISV_ECR_REGION} \
    --repository-name rake-app

docker pull ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0

Running the following command, you should see each of our three application Docker Images.

docker image ls --filter=reference='*amazonaws.com/*'

Below, we see an example of the expected terminal output from pulling the image and listing the images.

screen_shot_2019-10-28_at_7_11_32_am

Build the Docker Stack Locally

Next, build the Docker Swarm Stack. The Docker Compose file, stack.yml, is shown below. Note the location of the Images.

version: '3.7'

services:
  nlp-client:
    image: ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/nlp-client:1.1.0
    networks:
      - nlp-demo
    ports:
      - 8080:8080
    environment:
      - NLP_CLIENT_PORT
      - RAKE_ENDPOINT
      - PROSE_ENDPOINT
      - API_KEY
  rake-app:
    image: ${ISV_ACCOUNT}.dkr.ecr.${ISV_ECR_REGION}.amazonaws.com/rake-app:1.1.0
    networks:
      - nlp-demo
    environment:
      - RAKE_PORT
      - API_KEY
  prose-app:
    image: ${CUSTOMER_ACCOUNT}.dkr.ecr.${CUSTOMER_ECR_REGION}.amazonaws.com/prose-app:1.1.0
    networks:
      - nlp-demo
    environment:
      - PROSE_PORT
      - API_KEY

networks:
  nlp-demo:

volumes:
  data: {}

Execute the following commands to deploy the Docker Stack to Docker Swarm. Again, make sure you substitute the variable values below with your pseudo vendor and customer’s AWS Accounts and Regions. Additionally, API uses an API Key to protect all exposed endpoints, except the /health endpoint, across all three services. You should change the default CloudFormation template’s API Key parameter to something more secure.

# change me
export ISV_ACCOUNT=111222333444
export ISV_ECR_REGION=us-east-2
export CUSTOMER_ACCOUNT=999888777666
export CUSTOMER_ECR_REGION=us-west-2
export API_KEY=SuP3r5eCRetAutHK3y
 
# don't change me
export NLP_CLIENT_PORT=8080
export RAKE_PORT=8080
export PROSE_PORT=8080
export RAKE_ENDPOINT=http://rake-app:${RAKE_PORT}
export PROSE_ENDPOINT=http://prose-app:${PROSE_PORT}
 
docker swarm init 
docker stack deploy --compose-file stack.yml nlp

We can check the success of the deployment with the following commands.

docker stack ps nlp --no-trunc
docker container ls

Below, we see an example of the expected terminal output.

screen_shot_2019-10-23_at_11_56_38_pm.png

With the Docker Stack, we can hit the nlp-client service directly on localhost:8080. Unlike Fargate, which requires unique static ports for each container in the task, with Docker, we can choose run all the containers on the same port without conflict since only the nlp-client service is exposing port :8080. Additionally, there is no load balancer in front of the Stack, unlike ECS, since we only have a single node in our Swarm, and thus a single container instance of each microservice.

ecs-example-4

To test that the images were pulled successfully and the Docker Stack is running, we can execute a curl command against any of the API endpoints, such as /keywords. Below, I am using jq to pretty-print the JSON response payload.

#change me
API_KEY=SuP3r5eCRetAutHK3y

curl -s -X POST \
    "http://localhost:${NLP_CLIENT_PORT}/keywords" \
    -H 'Content-Type: application/json' \
    -H "X-API-Key: ${API_KEY}" \
    -d '{"text": "The Internet is the global system of interconnected computer networks that use the Internet protocol suite to link devices worldwide."}' | jq

The resulting JSON payload should look similar to the following output. These results indicate that the nlp-client service was reached successfully and that it was then subsequently able to communicate with the rake-app service, whose container image originated from the vendor’s ECR repository.

[
    {
        "candidate": "interconnected computer networks",
        "score": 9
    },
    {
        "candidate": "link devices worldwide",
        "score": 9
    },
    {
        "candidate": "internet protocol suite",
        "score": 8
    },
    {
        "candidate": "global system",
        "score": 4
    },
    {
        "candidate": "internet",
        "score": 2
    }
]

Creating Amazon ECS Environment

Although using Docker Swarm locally is a great way to understand how cross-account ECR access works, it is not a typical use case for deploying containerized applications on the AWS Platform. More often, you would use Amazon ECS, Amazon Elastic Kubernetes Service (EKS), or enterprise versions of third-party orchestrators such as Docker Enterprise, RedHat OpenShift, or Rancher.

Using CloudFormation and some very convenient CloudFormation templates supplied by Amazon as a starting point, we will create a complete ECS environment for our application. First, we will create a VPC to house the ECS cluster and a public-facing ALB to front our ECS-based application, using the public-vpc.yml template.

aws cloudformation create-stack \
    --stack-name public-vpc \
    --template-body file://public-vpc.yml \
    --capabilities CAPABILITY_NAMED_IAM

Next, we will create the ECS cluster and an Amazon ECS Task Definition, using the public-subnet-public-loadbalancer.yml template. Again, the Task Definition defines how ECS will deploy our application using AWS Fargate. Amazon Fargate allows you to run containers without having to manage servers or clusters. No EC2 instances to manage! Woot! Below, in the CloudFormation template, we see the ContainerDefinitions section of the TaskDefinition resource, container three container definitions. Note the three images and their ECR locations.

ContainerDefinitions:
  - Name: nlp-client
    Cpu: 256
    Memory: 1024
    Image: !Join ['.', [!Ref AWS::AccountId, 'dkr.ecr', !Ref AWS::Region, 'amazonaws.com/nlp-client:1.1.0']] 
    PortMappings:
      - ContainerPort: !Ref ContainerPortClient
    Essential: true
    LogConfiguration:
      LogDriver: awslogs
      Options:
        awslogs-region: !Ref AWS::Region
        awslogs-group: !Ref CloudWatchLogsGroup
        awslogs-stream-prefix: ecs
    Environment:
      - Name: NLP_CLIENT_PORT
        Value: !Ref ContainerPortClient
      - Name: RAKE_ENDPOINT
        Value: !Join [':', ['http://localhost', !Ref ContainerPortRake]] 
      - Name: PROSE_ENDPOINT
        Value: !Join [':', ['http://localhost', !Ref ContainerPortProse]] 
      - Name: API_KEY
        Value: !Ref ApiKey
  - Name: rake-app
    Cpu: 256
    Memory: 1024
    Image: !Join ['.', [!Ref VendorAccountId, 'dkr.ecr', !Ref VendorEcrRegion, 'amazonaws.com/rake-app:1.1.0']] 
    Essential: true
    LogConfiguration:
      LogDriver: awslogs
      Options:
        awslogs-region: !Ref AWS::Region
        awslogs-group: !Ref CloudWatchLogsGroup
        awslogs-stream-prefix: ecs
    Environment:
      - Name: RAKE_PORT
        Value: !Ref ContainerPortRake
      - Name: API_KEY
        Value: !Ref ApiKey
  - Name: prose-app
    Cpu: 256
    Memory: 1024
    Image: !Join ['.', [!Ref AWS::AccountId, 'dkr.ecr', !Ref AWS::Region, 'amazonaws.com/prose-app:1.1.0']] 
    Essential: true
    LogConfiguration:
      LogDriver: awslogs
      Options:
        awslogs-region: !Ref AWS::Region
        awslogs-group: !Ref CloudWatchLogsGroup
        awslogs-stream-prefix: ecs
    Environment:
      - Name: PROSE_PORT
        Value: !Ref ContainerPortProse
      - Name: API_KEY
        Value: !Ref ApiKey

Execute the following command to create the ECS cluster and an Amazon ECS Task Definition using the CloudFormation template.

# change me
ISV_ACCOUNT=111222333444
ISV_ECR_REGION=us-east-2
AUTH_KEY=SuP3r5eCRetAutHK3y

aws cloudformation create-stack \
    --stack-name public-subnet-public-loadbalancer \
    --template-body file://public-subnet-public-loadbalancer.yml \
    --parameters \
        ParameterKey=VendorAccountId,ParameterValue=${ISV_ACCOUNT} \
        ParameterKey=VendorEcrRegion,ParameterValue=${ISV_ECR_REGION} \
        ParameterKey=ApiKey,ParameterValue=${API_KEY} \
    --capabilities CAPABILITY_NAMED_IAM

Below, we see an example of the expected output from the CloudFormation Management Console.

screen_shot_2019-10-23_at_6_19_25_pm

The CloudFormation template does not enable CloudWatch Container Insights by default. Insights collects, aggregates, and summarizes metrics and logs from your containerized applications. To enable Insights, execute the following command:

aws ecs put-account-setting \
  --name "containerInsights" --value "enabled"

Confirming the Cross-account Policy

If everything went right in the previous steps, we should now have an ECS cluster running our containerized application, including the container built from the vendor’s Docker image. Below, we see an example of the ECS cluster, displayed in the management console.

screen_shot_2019-10-23_at_6_19_36_pm

Within the ECR cluster, we should observe a single running ECS Service. According to AWS, Amazon ECS allows you to run and maintain a specified number of instances of a task definition simultaneously in an Amazon ECS cluster. This is called a service.screen_shot_2019-10-23_at_6_19_42_pm

We are running two instances of each container on ECS, thus two copies of the task within the single service. Each task runs its containers in a different Availability Zone for high-availability.

screen_shot_2019-10-26_at_11_10_06_pm

Drilling into the service, we should note the new ALB, associated with the new VPC, two public Subnets, and the corresponding Security Group.

screen_shot_2019-10-23_at_6_19_49_pm

Switching to the Task Definitions tab, we should see the details of our task. Note the three containers that compose the application. Note that two are located in the customer’s ECR repositories, and one is located in the vendor’s ECR repository.

screen_shot_2019-10-23_at_6_21_03_pm

Drilling in a little farther, we will see the details of each container definition, including environment variables, passed from ECR to the container, and on to the actual Go-binary, running in the container.

screen_shot_2019-10-23_at_6_21_24_pm

Reaching our Application on ECS

Whereas with our earlier Docker Swarm example, the curl command was issued against /localhost, we now have the public-facing Application Load Balancer (ALB) in front of our ECS-based application. We will need to use the DNS name of your ALB as the host, to hit our application on ECS. The DNS address (A Record) can be obtained from the Load Balancer Management Console, as shown below, or from the Output tab of the public-vpc CloudFormation Stack

screen_shot_2019-10-25_at_1_08_00_pm.png

Another difference between the earlier Docker Swarm example and ECS is the port. Although the edge service, nlp-client, runs on port :8080, the ALB acts as a reverse proxy, passing requests from port :80 on the ALB to port :8080 of the nlp-client container instances (actually, the shared ENI of the running task). For the sake of brevity and simplicity, I did not set up a custom domain name for the ALB, nor HTTPS, as you normally would in Production.

ecs-example-3

To test our deployed ECS, we can use a tool like curl or Postman to test the API’s endpoints. Below, we see a POST against the /tokens endpoint, using Postman. Don’t forget to you will need to add Auth, the ‘X-API-Key’ header key/value pair.

screen_shot_2019-10-23_at_10_11_12_pm

Cleaning Up

To clean up the demonstration’s AWS resources and Docker Stack, run the following scripts in the appropriate AWS Accounts. Importantly, similar to S3, you must delete all the Docker images in the ECR repositories first, before deleting the repository, or else you will receive a CloudFormation error. This includes untagged images.

# customer account only
aws ecr batch-delete-image \
    --repository-name nlp-client \
    --image-ids imageTag=1.1.0

aws ecr batch-delete-image \
    --repository-name prose-app \
    --image-ids imageTag=1.1.0

aws cloudformation delete-stack \
    --stack-name ecr-repo-nlp-client

aws cloudformation delete-stack \
    --stack-name ecr-repo-prose-app

aws cloudformation delete-stack \
    --stack-name public-subnet-public-loadbalancer

aws cloudformation delete-stack \
    --stack-name public-vpc
docker stack rm nlp

# vendor account only
aws ecr batch-delete-image \
    --repository-name rake-app \
    --image-ids imageTag=1.1.0

aws cloudformation delete-stack \
    --stack-name ecr-repo-rake-app

# both accounts
aws cloudformation delete-stack \
    --stack-name developer-user-group

Conclusion

In the preceding post, we saw how multiple AWS Accounts can share ECR-based Docker Images. There are variations and restrictions to the configuration of the ECR Repository Policies, depending on the deployment tools you are using, such as AWS CodeBuild, AWS CodeDeploy, or AWS Elastic Beanstalk. AWS does an excellent job of providing some examples in their documentation, including Amazon ECR Repository Policy Examples and Amazon Elastic Container Registry Identity-Based Policy Examples.

All opinions expressed in this post are my own and not necessarily the views of my current or past employers or their clients.

, , , , , ,

Leave a comment