Scaling self-hosted runners
Introduction
Maintaining a fixed compute fleet of self-hosted runners can incur unnecessary costs, because the workload fluctuates with the rate at which jobs are queued. To help reduce this cost, the compute fleet can be scaled according to demand.
You can view an example tutorial for scaling machine runners with AWS Auto Scaling groups on the CircleCI blog.
Container runner
If you’re using the container runner on Kubernetes, it will automatically spin up more pods as more work enters the queue. Those pods are ephemeral and are torn down once their job finishes executing. While pods scale with the work automatically, CircleCI will not scale the underlying compute for your Kubernetes cluster.
Scaling data
There are several API endpoints to help you set up a solution to scale machine runners or your Kubernetes cluster for container runners. The task data endpoints report, for a given resource class, how many tasks are waiting to be claimed and how many are currently running. A scaling solution can use these endpoints to calculate the total number of waiting tasks which can be run. The task data endpoints are scoped to a single resource class, so it’s important to query every available resource class to get the complete picture of waiting and running tasks.
If you’re using machine runners, you can devise a scaling solution that adds more machine runners to the resource classes with the most pending work.
You will also need to be aware of your plan’s concurrency limit to avoid starting compute which cannot be used. This can be found on the CircleCI Plans page.
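As a concrete starting point, here is a minimal Python sketch that polls task counts for a set of resource classes and sums the waiting work. The endpoint paths (/runner/tasks and /runner/tasks/running on runner.circleci.com), the Circle-Token header, the unclaimed_task_count field name, and the resource class names are assumptions, not taken from this page; verify them against the CircleCI Runner API reference before building on this.

```python
"""Sketch: sum waiting tasks across resource classes via the Runner API.

Endpoint paths, header name, and response fields below are assumptions;
check them against the CircleCI Runner API reference.
"""
import json
import os
import urllib.parse
import urllib.request

BASE = "https://runner.circleci.com/api/v3"  # assumed Runner API host
TOKEN = os.environ["CIRCLECI_TOKEN"]         # personal API token

# Hypothetical resource classes managed by the scaling solution.
RESOURCE_CLASSES = ["my-namespace/linux-medium", "my-namespace/linux-large"]


def get(path: str, resource_class: str) -> dict:
    """GET a Runner API endpoint scoped to one resource class."""
    query = urllib.parse.urlencode({"resource-class": resource_class})
    req = urllib.request.Request(
        f"{BASE}{path}?{query}", headers={"Circle-Token": TOKEN}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


total_waiting = 0
for rc in RESOURCE_CLASSES:
    waiting = get("/runner/tasks", rc)          # unclaimed (waiting) tasks
    running = get("/runner/tasks/running", rc)  # tasks currently running
    print(rc, waiting, running)
    # Field name is an assumption about the response shape.
    total_waiting += int(waiting.get("unclaimed_task_count", 0))

print("Total waiting tasks across all resource classes:", total_waiting)
```

A scaling solution would compare this total, together with the running counts, against the plan’s concurrency limit before deciding how many machines to start.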
Agent configuration for machine runner
There are some machine runner configuration settings that can be used by your scaling solution, particularly to assist with resource cleanup when demand drops:

- Runner mode
  - Choosing single-task mode will cause the machine runner to shut down after completing a single task. This is useful if you are using completely ephemeral compute, as the resources can be automatically recycled when the machine runner exits.
  - Choosing continuous mode will cause the machine runner to poll for new tasks after completing a task. Your scaling solution will need to monitor the task workload and actively shut down unused machine runners (a sketch follows this list).
- Idle timeout
  - Setting a reasonable timeout can be used for automatic resource recycling during periods of lower demand.
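For continuous mode in particular, the scaling solution itself has to notice idle capacity and release it. The outline below is a hedged Python sketch of such a scale-down loop; all three helper functions are placeholders you would implement yourself (the task-count query from the earlier sketch and your cloud provider’s terminate call) and are not CircleCI-provided interfaces.

```python
"""Sketch of a scale-down loop for machine runners in continuous mode.

Every helper here is a placeholder for logic you supply; none of these
names come from CircleCI tooling.
"""
import time


def waiting_tasks(resource_class: str) -> int:
    """Placeholder: query the Runner API for unclaimed tasks (see earlier sketch)."""
    raise NotImplementedError


def idle_runner_count(resource_class: str) -> int:
    """Placeholder: compare registered runners with running tasks to estimate idle machines."""
    raise NotImplementedError


def terminate_one_idle_machine(resource_class: str) -> None:
    """Placeholder: stop the runner agent gracefully, then release the instance via your cloud provider."""
    raise NotImplementedError


def scale_down_loop(resource_class: str, poll_seconds: int = 60) -> None:
    """Shut down one idle machine at a time while no work is waiting."""
    while True:
        if waiting_tasks(resource_class) == 0 and idle_runner_count(resource_class) > 0:
            terminate_one_idle_machine(resource_class)
        time.sleep(poll_seconds)
```

With single-task mode or an idle timeout, much of this loop disappears, since the runner exits on its own and the machine only needs to be reclaimed afterwards.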