Distributed load testing with Gatling using Docker and AWS — Part 2

Richard Hendricksen
9 min read · Sep 25, 2020

About a year ago I wrote my initial article about distributed load testing with Gatling using Docker and AWS. Since then I have improved that setup quite a bit. This article describes the improved setup and points out the changes I made.

This article references the first one quite a bit; it can be found here:

Previous setup

In my original setup I used a Gatling Maven project for the load test code. From this a Docker image containing the Maven project was created and pushed to AWS ECR, which served as the Docker registry.
ecs-cli was used to create the needed infra and run the Docker image on AWS. After finishing the test, the Docker container would write its simulation.log to an S3 bucket, where a bash script could later retrieve it to build the HTML report.

Improvements over the previous setup

  • Create a runnable fat jar for the Gatling Docker container
  • Set up the infra using AWS CDK instead of ecs-cli
  • Run the load test using the AWS SDK instead of bash scripts
  • Pass environment variables to individual gatling-runners, for example which userIds to use
  • Add realtime monitoring

Example repository

The code examples in this article all come from my GitHub repository, which contains a fully worked out example project:

The code used in the original article can be found in the v1 branch.

Step 1. Gatling Maven Project

This step is the same as in my original article; read it here.

Step 2. Creating Docker image

This is the first improvement. In the previous guide I created a Docker image that contained Maven and the source code. On startup the Docker container would call mvn clean gatling:test, which uses the gatling-maven-plugin to compile the code and run the Gatling test.
But the Maven repository is not included in the Docker image, so every time a container starts, all required dependencies are downloaded and the source has to be compiled again.

What would be better is a fat jar that includes the compiled test code and all its dependencies, so the Gatling test can be run directly using java.

Creating the fat jar

First create a jar of the Gatling test code. To do this, add the maven-jar-plugin to the pom.xml:
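
A minimal sketch of that configuration (the plugin version here is an assumption; see the example repository for the exact config):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <version>3.2.0</version>
    <executions>
        <execution>
            <goals>
                <!-- packages the compiled test classes into a -tests.jar -->
                <goal>test-jar</goal>
            </goals>
        </execution>
    </executions>
</plugin>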

When running mvn clean package it will create gatling-runner-1.0-SNAPSHOT-tests.jar in the target folder. This jar will contain all the compiled test code, but no dependencies. To bundle the dependencies, the maven-assembly-plugin is used. However, this plugin will only include runtime dependencies, so remove the test scope from all dependencies.
Then add the maven-assembly-plugin to the pom.xml:
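
Again a sketch under the same assumptions; the built-in jar-with-dependencies descriptor bundles all runtime dependencies into a single jar:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>3.3.0</version>
    <configuration>
        <descriptorRefs>
            <!-- built-in descriptor that bundles all runtime dependencies -->
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
    </configuration>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>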

This will generate gatling-runner-1.0-SNAPSHOT-jar-with-dependencies.jar in the target folder. With both of these jars combined, the Gatling project can be run using java.

When running the Gatling test using Maven, the gatling-maven-plugin is used. This plugin sets specific Java runtime options that we now need to set ourselves.

Combine this in an entrypoint script that runs the Gatling code using java with the needed JAVA_OPTS. After the test the script also uploads the simulation.log file to the S3 bucket:
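
A minimal sketch of such an entrypoint script; the JVM options mirror what the gatling-maven-plugin would set, and the exact S3 key layout is an assumption:

#!/bin/bash
set -e

# JVM options comparable to what the gatling-maven-plugin sets (assumption: tune per Gatling version)
JAVA_OPTS="-server -Xmx1G -XX:+HeapDumpOnOutOfMemoryError -XX:+UseG1GC"

# Run the simulation from the two jars; -rf sets the results folder, -nr skips local report generation
java $JAVA_OPTS \
  -cp gatling-runner-1.0-SNAPSHOT-tests.jar:gatling-runner-1.0-SNAPSHOT-jar-with-dependencies.jar \
  io.gatling.app.Gatling -s "$SIMULATION" -rf /gatling/results -nr

# Upload the simulation.log to the S3 bucket under a unique prefix per container
LOG_FILE=$(find /gatling/results -name simulation.log | head -n 1)
aws s3 cp "$LOG_FILE" "s3://${REPORT_BUCKET}/$(hostname)-$(date +%s)/simulation.log"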

The improved Docker image no longer needs Maven or the source code; it only needs the JRE, the aws-cli and both jars:
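
A sketch of the resulting Dockerfile, assuming a slim JRE base image; the repository may use a different base or paths:

FROM openjdk:11-jre-slim

# aws-cli is needed to upload the simulation.log to S3 after the test
RUN apt-get update && apt-get install -y --no-install-recommends awscli && rm -rf /var/lib/apt/lists/*

WORKDIR /gatling

# Only the two jars and the entrypoint script, no Maven or source code
COPY target/gatling-runner-1.0-SNAPSHOT-tests.jar .
COPY target/gatling-runner-1.0-SNAPSHOT-jar-with-dependencies.jar .
COPY entrypoint.sh .
RUN chmod +x entrypoint.sh

ENTRYPOINT ["./entrypoint.sh"]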

Build the Docker image with docker build -t gatling-runner .

When running locally you will need to set the required environment variables:

  • SIMULATION: The Gatling simulation to run; this needs to be the fully qualified class name, e.g. nl.codecontrol.gatling.simulations.BasicSimulation
  • REPORT_BUCKET: The S3 bucket to write the Gatling results file to
  • Optionally AWS_PROFILE: Set this if you want to use a non-default profile

Then run the container:
docker run --rm \
-v ${HOME}/.aws/credentials:/root/.aws/credentials:ro \
-e SIMULATION=nl.codecontrol.gatling.simulations.BasicSimulation \
-e REPORT_BUCKET=<S3_BUCKET> \
-e AWS_PROFILE=<AWS_PROFILE> \
gatling-runner

Step 3. Setting up infra

In the original setup ecs-cli was used to set up the infra, but it required other infra parts to already exist. For example, to run the gatling-runner on an ECS cluster you need a task role with S3 write permissions. You also had to manually create ECR repositories to store the Docker images on AWS.

To streamline this I migrated to AWS CDK, the AWS Cloud Development Kit. It is a software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. The AWS CDK supports TypeScript, JavaScript, Python, Java, and C#/.NET. In this guide Java is used.

In the following steps I won't go into too much detail on how AWS CDK works, but I will show the code that generates what's needed. Perhaps in the future I will write a separate article about this topic.

Getting started

Install the AWS CDK:
npm install -g aws-cdk

Now init a new project using:
cdk init app --language java

In setting up the infra, an existing VPC and S3 bucket will be used, which are passed in using env vars. Everything else will be created by the CDK app.

The reason I use an existing S3 bucket is that when you destroy the stack using cdk while the S3 bucket still contains files, the bucket itself won't be deleted. When you then want to recreate the stack, you will get errors saying the S3 bucket you want to create already exists. To prevent this I will just assume you already created the S3 bucket manually.

First set up the CDK app. It passes the AWS credentials to the stack and expects the env vars VPC_ID and REPORT_BUCKET. The app then creates the GatlingRunnerEcsStack, which in turn consists of all the infra parts that are needed.
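
A condensed sketch of what the app could look like (CDK v1, Java; the stack constructor signature is an assumption, see the repository for the real code):

import software.amazon.awscdk.core.App;
import software.amazon.awscdk.core.Environment;
import software.amazon.awscdk.core.StackProps;

public class GatlingInfraCDKApp {
    public static void main(final String[] args) {
        App app = new App();

        // Pass the AWS account/region from the environment to the stack
        Environment env = Environment.builder()
                .account(System.getenv("CDK_DEFAULT_ACCOUNT"))
                .region(System.getenv("CDK_DEFAULT_REGION"))
                .build();

        // VPC_ID and REPORT_BUCKET point to the pre-existing VPC and S3 bucket
        new GatlingRunnerEcsStack(app, "GatlingRunnerEcsStack",
                StackProps.builder().env(env).build(),
                System.getenv("VPC_ID"), System.getenv("REPORT_BUCKET"));

        app.synth();
    }
}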

The GatlingRunnerEcsStack creates:

  • ECS Cluster to run the tasks on
  • IAM Roles for running Fargate tasks and writing to the S3 bucket
  • Docker image builder using the Dockerfile from the Maven Gatling project
  • Fargate task definition for running the Docker container

It uses helper classes for the Roles, ContainerOptions and TaskDefinition. You can find the source code for those here.
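
A condensed sketch of the heart of the stack; the builder calls are standard CDK v1 APIs, but the structure is simplified compared to the repository:

// Inside GatlingRunnerEcsStack (simplified)
IVpc vpc = Vpc.fromLookup(this, "vpc", VpcLookupOptions.builder().vpcId(vpcId).build());

Cluster cluster = Cluster.Builder.create(this, "gatling-cluster")
        .clusterName("gatling-cluster")
        .vpc(vpc)
        .build();

FargateTaskDefinition taskDefinition = FargateTaskDefinition.Builder.create(this, "gatling-runner")
        .family("gatling-runner")
        .cpu(1024)
        .memoryLimitMiB(2048)
        .build();

// Builds the Docker image from the gatling-runner Dockerfile and pushes it to ECR
taskDefinition.addContainer("gatling-runner", ContainerDefinitionOptions.builder()
        .image(ContainerImage.fromAsset("../gatling-runner"))
        .logging(LogDriver.awsLogs(AwsLogDriverProps.builder().streamPrefix("gatling").build()))
        .build());

// Allow the task role to write the simulation.log to the report bucket
Bucket.fromBucketName(this, "reportBucket", reportBucket).grantWrite(taskDefinition.getTaskRole());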

Deploying the infra

First compile the code using: mvn package

You can now deploy the Infra stack using:

VPC_ID=<VPC_ID> \
REPORT_BUCKET=<S3_BUCKET> \
cdk deploy --profile <AWS_PROFILE>

The output will look something like this:

When running for the first time you might get the following error: GatlingRunnerEcsStack failed: Error: This stack uses assets, so the toolkit stack must be deployed to the environment.
You can fix this by bootstrapping the CDK for your account:
cdk bootstrap aws://<aws_id>/<region>

Step 4. Creating AWS Runner

In the original setup I used mostly bash shell scripting to run the test; for the new setup I moved to the AWS SDK. This gives us more possibilities in setting up the test runner, while also being easier to write and maintain.

Set up the Maven project using the following pom.xml:
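
The essential part, sketched below, is the ECS module of the AWS SDK for Java v2 (the version is an assumption), next to the exec-maven-plugin that makes mvn exec:exec work:

<dependencies>
    <!-- AWS SDK v2 module for starting and polling ECS tasks -->
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>ecs</artifactId>
        <version>2.15.0</version>
    </dependency>
</dependencies>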

It contains all dependencies needed for the test runner.

The AWSLoadTestRunner does several things:

  • It checks that there are no tasks currently running on the cluster; if there are, it exits with an error.
  • It has a Config object containing the settings for the test run. Most important are the number of containers that will be spun up, the number of users per container and the duration.
  • For each requested container it creates a RunTaskRequest, passing the requested settings as environment variables, and then starts the task (see the sketch below).
  • When all tasks are started, it waits until every task is done.
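
A sketch of how such a RunTaskRequest could be built with the AWS SDK for Java v2; the container name and the way settings are passed in are assumptions:

import java.util.List;

import software.amazon.awssdk.services.ecs.EcsClient;
import software.amazon.awssdk.services.ecs.model.*;

class TaskStarter {
    // Starts one gatling-runner task on Fargate, passing the test settings as env vars
    static void startTask(EcsClient ecs, List<String> subnetIds, String simulation, String users) {
        RunTaskRequest request = RunTaskRequest.builder()
                .cluster("gatling-cluster")
                .taskDefinition("gatling-runner")
                .launchType(LaunchType.FARGATE)
                .networkConfiguration(NetworkConfiguration.builder()
                        .awsvpcConfiguration(AwsVpcConfiguration.builder()
                                .subnets(subnetIds) // resolved from VPC_ID
                                .assignPublicIp(AssignPublicIp.ENABLED)
                                .build())
                        .build())
                .overrides(TaskOverride.builder()
                        .containerOverrides(ContainerOverride.builder()
                                .name("gatling-runner")
                                .environment(
                                        KeyValuePair.builder().name("SIMULATION").value(simulation).build(),
                                        KeyValuePair.builder().name("USERS").value(users).build())
                                .build())
                        .build())
                .build();

        ecs.runTask(request);
    }
}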

Set the desired config using environment variables and then execute the AWSLoadTestRunner using Maven:

AWS_PROFILE=<AWS_PROFILE> \
VPC_ID=<VPC_ID> \
CLUSTER=gatling-cluster \
TASK_DEFINITION=gatling-runner \
CONTAINERS=10 \
USERS=2000 \
MAX_DURATION=1 \
SIMULATION=nl.codecontrol.gatling.simulations.BasicSimulation \
mvn clean compile exec:exec

The output will look something like this:

Step 5. Generating HTML report

This step has not changed from the previous article. The Gatling runners still write their simulation.log files to the S3 bucket when they are done. We then retrieve them using the aws-cli and combine them using built-in Gatling functionality. See part 1 of my article for the full explanation.
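
For reference, the gist of it: download the logs and let Gatling build the report in reports-only mode. This is a sketch; the bucket layout is an assumption, and nested per-container prefixes may need flattening first:

# Download all simulation.log files from the S3 bucket into one folder
aws s3 sync "s3://<S3_BUCKET>" ./results/combined

# Generate a single HTML report from the combined logs using Gatling's reports-only mode
java -cp target/gatling-runner-1.0-SNAPSHOT-jar-with-dependencies.jar \
  io.gatling.app.Gatling -rf ./results -ro combined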

Step 6. Add realtime monitoring

One of the first things I wanted to add to the previous solution was a way to monitor the load test in realtime. This way I can spot performance issues during the test, instead of having to wait until the test has finished and the Gatling report is created.

The best way to do this is to let Gatling write to an InfluxDB using the Graphite protocol, and to create a Grafana dashboard that uses the InfluxDB as a data source. An explanation of this solution can also be found on the Gatling website.

Creating InfluxDB and Grafana Docker images

Before Gatling can be configured, an InfluxDB and a Grafana service need to be set up. I will use Docker images for these as well.

First create the InfluxDB Docker image:
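
A sketch assuming the official influxdb 1.x image; the version tag is an assumption:

FROM influxdb:1.8

# Custom config enables the Graphite listener and the templates used later by Grafana
COPY influxdb.conf /etc/influxdb/influxdb.conf

# Init scripts in this folder are executed on first startup
COPY init-influxdb.sh /docker-entrypoint-initdb.d/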

A custom influxdb.conf is used to set the templates required for visualizing the data later in Grafana. Templates allow matching parts of a metric name to be used as tag names in the stored metric.
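
The relevant graphite section could look like this; the template shown follows the example from the Gatling docs and is an assumption about this repository's exact config:

[[graphite]]
  enabled = true
  bind-address = ":2003"
  database = "graphite"
  protocol = "tcp"
  # Maps e.g. gatling.<simulation>.<request>.<status>.<percentile> onto tags
  templates = [
    "gatling.*.*.*.* measurement.simulation.request.status.field"
  ]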

The other file copied into the Docker image is an init script that sets a retention policy of 1d on the Graphite database to keep the database size in check:
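
A sketch of such an init script, using InfluxQL to create the database with a one-day retention policy (the names are assumptions):

#!/bin/bash
# Create the graphite database; data older than 1 day is dropped automatically
influx -execute 'CREATE DATABASE "graphite" WITH DURATION 1d REPLICATION 1 NAME "one_day"'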

Then the Grafana image:
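
A sketch assuming the official grafana image; the version, paths and plugin install mechanism are assumptions, and the renderer plugin in particular may need extra OS dependencies:

FROM grafana/grafana:7.1.5

# Install the image renderer plugin so dashboard screenshots can be made via the API
ENV GF_INSTALL_PLUGINS=grafana-image-renderer

# Provision the InfluxDB datasource and the Gatling dashboard on startup
COPY provisioning/ /etc/grafana/provisioning/
COPY dashboards/ /var/lib/grafana/dashboards/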

This one is a little more complex. I use the provisioning capabilities of the Grafana image to provision the InfluxDB datasource and the dashboard. Since I also want to be able to make screenshots of the dashboard using the Grafana API, I needed to set up the renderer plugin as described here.

Check my repository for the InfluxDB datasource and dashboard provision files.

Deploying monitoring infra

The monitoring solution will also be deployed on AWS. To create the needed infrastructure I will again use AWS CDK. The monitoring infra will be a separate stack, so I can decide which of the two stacks I want to deploy at any given time.

First update the GatlingInfraCDKApp to contain the second stack:
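
Sketched, the app now simply instantiates both stacks (the constructor signatures are assumptions):

// GatlingInfraCDKApp now creates both stacks; deploy them individually by name
new GatlingRunnerEcsStack(app, "GatlingRunnerEcsStack",
        StackProps.builder().env(env).build(),
        System.getenv("VPC_ID"), System.getenv("REPORT_BUCKET"));

new GatlingRealtimeMonitoringEcsStack(app, "GatlingMonitoringEcsStack",
        StackProps.builder().env(env).build(),
        System.getenv("VPC_ID"));

app.synth();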

The GatlingRealtimeMonitoringEcsStack will create the following:

  • A separate ECS cluster, so it won't interfere with the gatling runners
  • The needed IAM roles
  • The needed security group
  • Containers for both InfluxDB and Grafana
  • A task definition containing the InfluxDB and Grafana containers
  • A Fargate service running the task definition
  • A load balancer providing a DNS name for the Gatling containers to connect to

Check the repository for the helper classes.

Deploying the monitoring infra

Again, first compile the code using: mvn package

Since there are now two stacks in the CDK App you need to define which stack you want to deploy:

VPC_ID=<VPC_ID> \
REPORT_BUCKET=<S3_BUCKET> \
cdk deploy GatlingMonitoringEcsStack --profile <profile>

This will result in the following output:

The load balancer will have generated a DNS name where the monitoring stack is available. Go to AWS EC2 to find it:

Or use the aws-cli tooling:

aws elbv2 describe-load-balancers \
--names gatling-monitoring \
--query 'LoadBalancers[].DNSName' \
--output text

At this address the Grafana dashboard will be available on port 80. The Graphite endpoint for InfluxDB will be available on port 2003.

Configure Gatling

Changing the Gatling config is easy. Just enable the graphite section in gatling.conf:
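
Sketched below; host should be the DNS name of the monitoring load balancer (the placeholder is mine), the rest follows the defaults shipped in gatling.conf:

gatling {
  data {
    writers = [console, file, graphite]   # enable the graphite writer
    graphite {
      host = "<LOAD_BALANCER_DNS>"        # DNS name of the gatling-monitoring load balancer
      port = 2003
      protocol = "tcp"
      rootPathPrefix = "gatling"
    }
  }
}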

Now build the gatling-runner project and redeploy the GatlingRunnerEcsStack so the Docker image gets updated on ECR.

Now run a load test and watch the dashboard…

Afterthoughts

I finally found time to write part 2 of my initial article, and I'm quite happy with the current solution. Maintenance has become much easier than with my initial setup, and the realtime monitoring is the best addition by far. I'll keep improving the setup over time; those improvements can be found in the GitHub repository mentioned above.
