Distributed load testing with Gatling using Docker and AWS
Update:
I have published part 2 of this article, which contains a lot of improvements over my initial setup. You can read it here:
Before our release to production we needed to perform a load test with more than 170,000 concurrent users. Since we already had experience running performance tests with Gatling, it was our go-to choice for setting up our load tests. If you are not familiar with Gatling yet: it’s an open source load and performance testing framework with tests written in Scala. It’s very powerful and has excellent reporting.
The challenge was how to reach our desired number of concurrent users. Since it’s not possible to generate that much load with only one machine, I knew we needed a distributed solution. Gatling offers its own distributed solution, Gatling FrontLine, but I figured it wouldn’t be much harder to create my own. This would give me more control over the final solution (and costs). Since running multiple Gatling instances results in multiple simulation.log files, they need to be aggregated into one report. Luckily Gatling has a solution for just this.
Since we are already using AWS for our target application, it was an easy choice as the cloud provider for the load test. And because I wanted something that was easy to automate, scale and reuse, Docker became the preferred choice. On AWS we will use ECR and ECS for this. For collecting the multiple simulation.log files we will use S3.
Now we need to assemble it all, step by step.
Step 1. Gatling Maven project
I will not go into detail on how to set up a Gatling Maven project; Gatling has excellent documentation about this. You can find an example project I created here:
All the gists used in this guide come from this project.
For this guide I will use a simplified example. This will be our simulation:
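A minimal sketch of such a simulation could look like this; the environment variable names (USERS, RAMP_DURATION, BASE_URL) and the target URL are illustrative and may differ from the example project:

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class ExampleSimulation extends Simulation {

  // Default values that can be overridden through environment variables
  val users: Int        = sys.env.getOrElse("USERS", "10").toInt
  val rampDuration: Int = sys.env.getOrElse("RAMP_DURATION", "30").toInt
  val baseUrl: String   = sys.env.getOrElse("BASE_URL", "https://example.com")

  val httpProtocol = http.baseUrl(baseUrl)

  val scn = scenario("Example scenario")
    .exec(http("home page").get("/"))

  setUp(
    scn.inject(rampUsers(users) during (rampDuration.seconds))
  ).protocols(httpProtocol)
}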
In this simulation you can use environment variables to override default values. You will see why this is convenient later.
Together with the pom.xml file we have our basic Gatling Maven project:
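The full pom.xml is in the example project; at its core it only needs the Gatling Highcharts dependency and the gatling-maven-plugin, roughly like this (the version properties are placeholders):

<dependencies>
  <dependency>
    <groupId>io.gatling.highcharts</groupId>
    <artifactId>gatling-charts-highcharts</artifactId>
    <version>${gatling.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>io.gatling</groupId>
      <artifactId>gatling-maven-plugin</artifactId>
      <version>${gatling-maven-plugin.version}</version>
    </plugin>
  </plugins>
</build>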
We can now run our Gatling Maven project with:
mvn clean gatling:test
Step 2. Creating Docker image
Our Docker image will need to do the following things:
- Contain the Gatling test code
- Be able to execute the test using Maven
- Have all the needed dependencies included (when running the tests I don’t want every container downloading all dependencies on its own)
- Upload the result to S3
This will be the Dockerfile:
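The Dockerfile itself lives in the example project; a sketch covering the requirements above could look like this (the base image tag and paths are assumptions):

# Base image with Maven and a JDK (exact tag is an assumption)
FROM maven:3-jdk-11

WORKDIR /opt/gatling-runner

# The AWS CLI is needed to upload the simulation.log files to S3
RUN apt-get update && \
    apt-get install -y --no-install-recommends awscli && \
    rm -rf /var/lib/apt/lists/*

# Copy the pom first and pre-fetch all dependencies, so the running
# containers don't have to download them individually
COPY pom.xml .
RUN mvn --batch-mode dependency:go-offline

# Copy the test code and the entrypoint script
COPY src ./src
COPY run.sh .
RUN chmod +x run.sh

ENTRYPOINT ["./run.sh"]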
The entrypoint for the image is run.sh. This script executes the test and uploads the result to an S3 bucket. The bucket needs to be supplied as an argument:
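The actual run.sh is part of the example project; a minimal sketch that follows the same idea, with flag handling matching the usage shown further below, could look like this:

#!/bin/bash
# run.sh (sketch): run the Gatling test and upload the simulation.log to S3
set -e

while getopts "r:p:" opt; do
  case $opt in
    r) BUCKET=$OPTARG ;;
    p) PROFILE_ARG="--profile $OPTARG" ;;
  esac
done

if [ -z "$BUCKET" ]; then
  echo "Usage: run.sh -r <s3_bucket> [-p <aws_profile>]"
  exit 1
fi

# Execute the test; the simulation picks up its parameters from the environment
mvn clean gatling:test

# Upload every simulation.log with a unique name so containers don't overwrite each other
for log in $(find target/gatling -name simulation.log); do
  aws s3 cp $PROFILE_ARG "$log" "s3://$BUCKET/logs/simulation-$(hostname)-$(date +%s).log"
done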
We can now create our Docker image:
docker build -t gatling-runner .
And run it!
Since the script will try to upload the results to an S3 bucket, we need to make sure our AWS credentials are available inside the container. When running on ECS the IAM role will take care of this, but when running locally with Docker we need to provide the correct AWS credentials ourselves. You can also provide an AWS profile if needed:
docker run --rm -v ${HOME}/.aws/credentials:/root/.aws/credentials:ro gatling-runner -r <s3_bucket> [-p <aws_profile>]
Step 3. Pushing Docker image to Amazon ECR
To be able to run our image on ECS, we first need to upload it to a Docker registry. Since we are using AWS, we will use ECR for this.
First open your AWS management console, go to ECR and create a new repository. I’ve called mine gatling-runner. Remember the repository URI; we will need it later.
Now log in to this repository with your local Docker:
$(aws ecr get-login --no-include-email)
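Note that this is the AWS CLI v1 syntax. If you are on AWS CLI v2, where get-login no longer exists, the equivalent is:

aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin <ecs_repo_uri>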
Tag your local image and push it to your ECR repository. The URI will look something like this, depending on your region: <aws_account_id>.dkr.ecr.eu-west-1.amazonaws.com/gatling-runner
docker tag gatling-runner <ecs_repo_uri>
docker push <ecs_repo_uri>
Afterwards you can log out from the ECR repository:
docker logout <ecs_repo_uri>
If you now visit the ECR repository you will see the pushed image:
Step 4. Running Docker container on ECS
Amazon ECS is Amazon’s version of Kubernetes/Docker Swarm. The most important part is that we can run Docker containers here. It is tightly integrated with other AWS services, and this is exactly what we are going to make use of.
Now that we have our Docker image available in ECR, we can create our cluster to run the image on.
There are 2 options when running containers on ECS: EC2 or Fargate. With EC2 you have to deploy and manage your own cluster of EC2 instances for running the containers. With Fargate you can run containers directly, without managing EC2 instances yourself. Both options are valid and will work with this setup; both have pros and cons which I will not discuss here. For the sake of simplicity I will use the Fargate setup in this guide.
To create the cluster we will need the Amazon ECS CLI tools. You can find them here. Make sure you have them installed and configured with your AWS account.
Now configure the ECS Fargate cluster:
ecs-cli configure --cluster gatlingCluster --region eu-west-1 --default-launch-type FARGATE --config-name gatlingConfiguration
Now start the cluster:
ecs-cli up
This will create all the necessary components for the cluster. Take note of the created subnets; we will need them later.
After some waiting you will have a cluster to run Docker containers on!
You can view it in the management console here.
For running our Docker container on the cluster we will use a docker-compose file. Here we can use environment variables to control our test parameters, without recreating our Docker image! The docker-compose file also sets up logging in AWS.
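The docker-compose.yml itself is part of the example project; a sketch of what it could look like follows, where the image URI, bucket name, log group and the environment variable names are placeholders or assumptions:

version: '3'
services:
  gatling-runner:
    image: <aws_account_id>.dkr.ecr.eu-west-1.amazonaws.com/gatling-runner
    # arguments passed to run.sh
    command: ["-r", "<s3_bucket>"]
    environment:
      # test parameters picked up by the simulation (names are illustrative)
      - USERS=250
      - RAMP_DURATION=60
    logging:
      driver: awslogs
      options:
        awslogs-group: gatling
        awslogs-region: eu-west-1
        awslogs-stream-prefix: gatling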
To be able to run a container on a Fargate cluster we will need to provide some additional information in a so-called ecs_params.yml file, as shown in the sketch below. Make sure that the IAM role inserted for task_role_arn has the correct rights for writing to the S3 bucket. Don’t forget this if you want reporting!
Those created subnets I told you to take note of? Insert them here.
Also insert the security group; the default one for the VPC of the cluster will do.
You can play around with these settings to find an optimal combination that works for you. Change mem_limit and cpu_limit if you experience memory issues or slow tests.
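Putting those settings together, an ecs_params.yml along these lines would do, with the role ARN, subnets and security group replaced by your own values (the sizes below are just examples):

version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  task_role_arn: arn:aws:iam::<aws_account_id>:role/<role_with_s3_write_access>
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 2GB
    cpu_limit: 1024
run_params:
  network_configuration:
    awsvpc_configuration:
      subnets:
        - <subnet_id_1>
        - <subnet_id_2>
      security_groups:
        - <security_group_id>
      assign_public_ip: ENABLED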
We can now run a single container of our test. The --create-log-groups flag will prevent an error if you haven’t created the log groups yet.
ecs-cli compose up --create-log-groups
You can use the task id to see some logging:
ecs-cli logs --task-id <task_id>
Nice, it worked! If you check the S3 bucket you will now find the simulation.log in the logs folder.
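You can verify this from the command line too, using the bucket and folder the script uploads to:

aws s3 ls s3://<s3_bucket>/logs/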
To scale our load test, we can change 2 things:
- the number of concurrent users per container
- the number of containers
We can adjust the former by changing the environment variable in the docker-compose file. We can change the latter by using the scale option:
ecs-cli compose scale 2
Now our load test is executed in 2 containers, doubling the number of concurrent users.
The best combination of concurrent users per container and the number of containers will be different for every test. Experiment with this to find the optimal solution.
Step 5. Generating HTML report
Since we now have a collection of simulation.log files in our S3 bucket, the next step is to generate a Gatling HTML report from these files.
The Maven project we created earlier has a special configuration for exactly this, which we will use now. The only part left is getting those log files from S3:
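The consolidation script is part of the example project as well; a sketch of the same idea, assuming the upload layout from run.sh above and a reports-only Gatling run (the exact Maven property may differ per plugin version), could be:

#!/bin/bash
# consolidate_reports (sketch): download logs, build one report, upload it back
set -e

while getopts "r:p:" opt; do
  case $opt in
    r) BUCKET=$OPTARG ;;
    p) PROFILE_ARG="--profile $OPTARG" ;;
  esac
done

LOG_DIR=target/gatling/logs
mkdir -p "$LOG_DIR"

# 1. Download all simulation.log files from the bucket
aws s3 cp $PROFILE_ARG --recursive "s3://$BUCKET/logs/" "$LOG_DIR/"

# 2. Let Gatling build one HTML report from all collected logs
#    (assumption: the pom supports a reports-only run via this property)
mvn gatling:test -Dgatling.reportsOnly=logs

# 3. Optionally upload the generated report back to the bucket
aws s3 cp $PROFILE_ARG --recursive "$LOG_DIR/" "s3://$BUCKET/gatling/"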
This script downloads the simulation.log files from S3, then calls the Maven configuration which generates the HTML report, and finally uploads the report back to the S3 bucket in the gatling folder. The last part is of course optional, but a good way to access the final report.
./consolidate_reports -r <bucket_name> [-p <aws_profile>]
The script expects the S3 bucket where the log files are located. You can optionally provide an AWS profile.
Step 6. Cleanup
When we are done with load testing, we can clean it all up with one simple command:
ecs-cli down --force
This command will clean up the created cluster and all generated components.
Afterthoughts
I am quite happy with my current setup, but there are of course always improvements to be made. I’d like to explore the possibilities of using spot EC2 instances in combination with the cluster to lower the costs even further. Also, the whole process should be scripted into a Jenkins build, so I can automate it even further and integrate it into our CI/CD pipeline.
This guide was inspired by:
Distributed load testing with Gatling and Kubernetes
Gatling AWS Maven Plugin
Gatling on ECS