Start Building for Free

Scaling Self-hosted Runners

2 weeks ago1 min read
Server v3.x
On This Page
  • Introduction
  • Scaling data
  • Agent configuration


Maintaining a fixed compute fleet for CircleCI self-hosted runners can incur unnecessary costs as the workload can fluctuate depending on the rate at which jobs are queued. To help reduce this cost, the compute fleet can be scaled according to the demand.

You can view an example tutorial for scaling self-hosted runners with AWS AutoScaling groups on the CircleCI blog.

Scaling data

There are several API endpoints to help you set up a solution to scale data:

A scaling solution can use the above endpoints to calculate the total number of waiting tasks which can be run. The task data endpoints are scoped to a single resource class, so it’s important to query every available resource class to get the total number of running tasks.

The scaling solution is then free to decide which resource class to add new claiming agents for. The distribution of tasks across resource classes is dependent on the incoming workload and internal dispatching decisions made by CircleCI.

NOTE: You will also need to be aware of your plan’s concurrency limit to avoid starting compute which cannot be used. This can be found on the CircleCI Plans page. Scale plan customers will need to contact support to get this information.

Agent configuration

There are some launch-agent configuration settings which can be used by your scaling solution, particularly to assist resource cleanup when the demand drops:

  • Runner Mode

    • Choosing single-task mode will cause the launch-agent to shut down after a single task. This is useful if using completely ephemeral compute as the resources can be automatically recycled upon launch-agent exit.

    • Choosing continuous mode will cause the launch-agent to poll for new tasks after completing a task. Your scaling solution will need to monitor the task workload and actively shutdown unused launch-agents.

  • Runner Idle Timeout

    • Setting a reasonable timeout can be used for automatic resource recycling during periods of lower demand.

Help make this document better

This guide, as well as the rest of our docs, are open source and available on GitHub. We welcome your contributions.

Need support?

Our support engineers are available to help with service issues, billing, or account related questions, and can help troubleshoot build configurations. Contact our support engineers by opening a ticket.

You can also visit our support site to find support articles, community forums, and training resources.