
Scaling self-hosted runners

Cloud
Server v4.x
Server v3.x

Introduction

Maintaining a fixed compute fleet of self-hosted runners can incur unnecessary costs, because the workload fluctuates with the rate at which jobs are queued. To reduce this cost, you can scale the compute fleet according to demand.

You can view an example tutorial for scaling machine runners with AWS Auto Scaling groups on the CircleCI blog.

Container runner

If you’re using the container runner on Kubernetes, it automatically spins up more pods as more work enters the queue. Those pods are ephemeral and are torn down after the job finishes executing. While pods scale with the work automatically, CircleCI does not scale the underlying compute for your Kubernetes cluster.

Scaling data

There are several API endpoints to help you set up a solution to scale machine runners or your Kubernetes cluster for container runners:

A scaling solution can use the above endpoints to calculate the total number of waiting tasks that can be run. The task data endpoints are scoped to a single resource class, so it’s important to query every available resource class and sum the results to get the total number of tasks across the fleet.
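The aggregation described above can be sketched in Python as follows. This is only an illustration: `fetch_counts` is a hypothetical callable that you would write to wrap the task data endpoints for one resource class, and `TaskCounts` is an assumed container type, not part of the CircleCI API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class TaskCounts:
    """Task counts for one resource class (hypothetical container type)."""
    waiting: int
    running: int


def total_task_counts(
    resource_classes: Iterable[str],
    fetch_counts: Callable[[str], TaskCounts],
) -> TaskCounts:
    """Sum waiting and running tasks over every resource class.

    fetch_counts is an assumed helper that queries the task data
    endpoints for a single resource class and returns its counts.
    """
    waiting = 0
    running = 0
    for resource_class in resource_classes:
        counts = fetch_counts(resource_class)
        waiting += counts.waiting
        running += counts.running
    return TaskCounts(waiting=waiting, running=running)
```

Passing the lookup as a callable keeps the summing logic independent of how the counts are retrieved, so it can be exercised with canned data before it is pointed at the real endpoints.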

If you’re using machine runners, you can devise a scaling solution to add more machine runners to the resource class that has more pending work.
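One possible way to decide where to add machine runners is sketched below. It assumes each machine runner executes one task at a time, that you already know the waiting task count and idle runner count per resource class, and that you cap the number of new runners per scaling pass. The function name and all parameters are hypothetical.

```python
from typing import Dict


def plan_scale_out(
    waiting_by_class: Dict[str, int],
    idle_by_class: Dict[str, int],
    max_new_runners: int,
) -> Dict[str, int]:
    """Split a budget of new runners across resource classes.

    Resource classes with the largest backlog (waiting tasks not
    covered by idle runners) are served first.
    """
    # Backlog per resource class, never negative.
    deficits = {
        rc: max(0, waiting - idle_by_class.get(rc, 0))
        for rc, waiting in waiting_by_class.items()
    }
    plan: Dict[str, int] = {}
    remaining = max_new_runners
    # Visit the most backlogged resource classes first.
    for rc, deficit in sorted(deficits.items(), key=lambda item: item[1], reverse=True):
        if remaining <= 0 or deficit == 0:
            break
        added = min(deficit, remaining)
        plan[rc] = added
        remaining -= added
    return plan
```

A greedy split like this favors the resource class with the most pending work. A proportional split may suit fleets where no single resource class should be starved.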

Agent configuration for machine runner

There are some machine runner configuration settings which can be used by your scaling solution, particularly to assist resource cleanup when the demand drops:

  • Runner Mode

    • Choosing single-task mode will cause machine runner to shut down after a single task. This is useful if using completely ephemeral compute as the resources can be automatically recycled upon machine runner exit.

    • Choosing continuous mode will cause the machine runner to poll for new tasks after completing a task. Your scaling solution will need to monitor the task workload and actively shut down unused machine runners.

  • Runner Idle Timeout

    • Setting a reasonable idle timeout lets a machine runner exit after it has gone that long without claiming a task, so its resources can be recycled automatically during periods of lower demand.
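For continuous mode, the monitoring side of a scaling solution could pick runners to shut down along the lines of this Python sketch. It assumes your tooling tracks how long each runner has been idle. `RunnerState`, `select_runners_to_stop`, and the threshold are hypothetical names, not machine runner settings.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RunnerState:
    """Snapshot of one machine runner (hypothetical type)."""
    name: str
    idle_seconds: float  # 0 while the runner is executing a task


def select_runners_to_stop(
    runners: Iterable[RunnerState],
    waiting_tasks: int,
    idle_threshold_seconds: float,
) -> List[str]:
    """Return names of long-idle runners that are safe to shut down.

    Conservatively keeps one long-idle runner per waiting task, so
    queued work is not left without capacity.
    """
    long_idle = sorted(
        (r for r in runners if r.idle_seconds >= idle_threshold_seconds),
        key=lambda r: r.idle_seconds,
        reverse=True,  # longest idle first
    )
    surplus = max(0, len(long_idle) - waiting_tasks)
    return [r.name for r in long_idle[:surplus]]
```

With single-task mode or an idle timeout configured, the runner exits on its own and a selection step like this is not needed.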

