Scaling self-hosted runners

Cloud

Server v4.x

Server v3.x

Introduction

Maintaining a fixed compute fleet of self-hosted runners can incur unnecessary costs as the workload can fluctuate depending on the rate at which jobs are queued. To help reduce this cost, the compute fleet can be scaled according to the demand.

You can view an example tutorial for scaling machine runners with AWS AutoScaling groups on the CircleCI blog.

Container runner

If you’re using the container runner on Kubernetes, it will automatically spin up more pods as more work enters the queue. Those pods are ephemeral and will be torn down after the job is done executing. While pods will scale with the work automatically, CircleCI will not go scale the underlying compute for your Kubernetes cluster.

Scaling data

There are several API endpoints to help you set up a solution to scale machine runners or your Kubernetes cluster for container runners:

A scaling solution can use the above endpoints to calculate the total number of waiting tasks which can be run. The task data endpoints are scoped to a single resource class, so it’s important to query every available resource class to get the total number of running tasks.

If you’re using machine runners, you can devise a scaling solution to add more machine runners to the resource class that has more pending work.

You will also need to be aware of your plan’s concurrency limit to avoid starting compute which cannot be used. This can be found on the CircleCI Plans page.

Agent configuration for machine runner

There are some machine runner configuration settings which can be used by your scaling solution, particularly to assist resource cleanup when the demand drops:

Runner Mode
- Choosing single-task mode will cause machine runner to shut down after a single task. This is useful if using completely ephemeral compute as the resources can be automatically recycled upon machine runner exit.
- Choosing continuous mode will cause the machine runner to poll for new tasks after completing a task. Your scaling solution will need to monitor the task workload and actively shutdown unused machine runners.
Runner Idle Timeout
- Setting a reasonable timeout can be used for automatic resource recycling during periods of lower demand.

Suggest an edit to this page

Make a contribution

Learn how to contribute

Still need help?

Ask the CircleCI community

Join the research community

Visit our Support site