Container runner reference

Server v4.3+

This document is a comprehensive guide to operating and configuring jobs with the CircleCI container runner.

Running your first job with container runner

Follow the instructions outlined on the Container runner installation page to download the container runner and run your first job. You can also use the CircleCI web app to get started with self-hosted runners.

Container runner sample configuration

version: 2.1

jobs:
  build:
    docker:
      - image: cimg/base:2021.11
    resource_class: <namespace>/<resource-class>
    steps:
      - checkout
      - ...

workflows:
  build-workflow: # the workflow name is arbitrary
    jobs:
      - build

Resource class configuration and custom task pod configuration

Container runner supports claiming and running tasks from multiple resource classes concurrently, as well as customization of the Kubernetes resources created to run tasks for a particular resource class. Configuration is provided by a map object in the Helm chart values.yaml.

Each resource class supports the following parameters:

  • token: The runner resource class token used to claim tasks (required).

  • Custom Kubernetes pod configuration for pods used to run CircleCI jobs.

The pod configuration takes all fields that a normal Kubernetes pod does. If service containers are used in a CircleCI job, the first container spec is used for all containers within the task pod. There is currently no way to provide a different container configuration between service containers and the main task container.

The following fields will be overwritten by container runner to ensure correct task function, and expected CircleCI configuration behavior:

  • spec.containers[0].name

  • spec.containers[0].container.image

  • spec.containers[0].container.args

  • spec.containers[0].container.command

  • spec.containers[0].container.workingDir

  • spec.restartPolicy


  • metadata.namespace

Below is a full configuration example, containing two resource classes:

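agent:
  resourceClasses:
    circleci-runner/resourceClass:
      token: TOKEN1
      spec:
        containers:
          - resources:
              limits:
                cpu: 500m
            volumeMounts:
              - name: xyz
                mountPath: /path/to/mount
        securityContext:
          runAsNonRoot: true
        imagePullSecrets:
          - name: my_cred
        volumes:
          - name: xyz
            emptyDir: {}

    circleci-runner/resourceClass2:
      token: TOKEN2
      spec:
        imagePullSecrets:
          - name: "other"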

Unsafe retries

Unsafe retries enable container runner to automatically rerun tasks that are unexpectedly interrupted during their execution. These disruptions could be due to network connectivity issues, the underlying node shutting down, or other unpredictable causes. Any job failure that would be displayed in the CircleCI web app as an infrastructure fail should be expected to trigger an unsafe retry when enabled.

Unsafe retries are useful when scheduling workloads on spot instances, which many Kubernetes providers offer at a lower cost in exchange for the risk of pod preemption.

The following sequence shows how unsafe retries work:

  1. If a pod fails or gets evicted during runtime, container runner will release the task.

  2. All resources managed by container runner for the task, such as the Kubernetes pod and secret, are cleaned up and deleted.

  3. The released task then becomes available for reclaim by any container runner instance configured for the same resource class.

  4. Once reclaimed, the task is restarted completely from scratch, including previously run steps.

  5. A task can be retried up to 3 times before it is deemed to have permanently failed.
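The sequence above can be modeled with a short Python sketch. It is only an illustration of the release and reclaim cycle, not container runner's implementation: `run_with_unsafe_retries` and `TaskResult` are hypothetical names, and the sketch assumes the limit of 3 counts reruns after the initial attempt.

```python
from dataclasses import dataclass, field
from typing import Callable, List

MAX_RETRIES = 3  # from the sequence above: a task can be retried up to 3 times


@dataclass
class TaskResult:
    succeeded: bool
    attempts: int
    log: List[str] = field(default_factory=list)


def run_with_unsafe_retries(run_task: Callable[[int], bool],
                            max_retries: int = MAX_RETRIES) -> TaskResult:
    """Model the cycle; run_task(attempt) returns False on an infrastructure failure."""
    log: List[str] = []
    attempts = 0
    # Assumption: one initial attempt plus up to max_retries reruns.
    while attempts <= max_retries:
        attempts += 1
        # Every attempt restarts the task from scratch, including earlier steps.
        log.append(f"attempt {attempts}: task claimed, running all steps from scratch")
        if run_task(attempts):
            return TaskResult(True, attempts, log)
        # The task is released and its pod and secret are cleaned up.
        log.append(f"attempt {attempts}: pod failed or evicted, task released")
    log.append("retries exhausted: task permanently failed")
    return TaskResult(False, attempts, log)


if __name__ == "__main__":
    # Hypothetical task that is preempted twice before completing.
    result = run_with_unsafe_retries(lambda attempt: attempt >= 3)
    print(result.succeeded, result.attempts)
```

Because each attempt reruns every step, this model also shows why steps with side effects (for example, publishing an artifact) should be safe to repeat before unsafe retries are enabled.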

To enable unsafe retries, set the enableUnsafeRetries flag in the resource class configuration for each resource class. The following example shows two resource class definitions. Unsafe retries are enabled for the first, which targets spot instances, but not for the second:

agent:
  resourceClasses:
    <namespace>/<resource-class-1>:
      enableUnsafeRetries: true
      token: your-resource-class-1-token
      # The following spec isn't required, but serves as an example of how you could schedule tasks on spot instances using tolerations for the node's taint
      spec:
        tolerations:
        - key: "lifecycle"
          operator: "Equal"
          value: "Ec2Spot"
          effect: "NoExecute"
    <namespace>/<resource-class-2>:
      # Unsafe retries are disabled by default
      token: your-resource-class-2-token
      # This resource class can only schedule tasks on nodes without taints specific to spot instances


Container runner logs an event whenever a task encounters a runtime failure. The specific error message is provided under the error field within the service-work span. To check whether the task is set to be rerun or not (either because it cannot be retried or all retries have been exhausted), you can inspect the app.to_retry field. This boolean indicates the retry status of the task.

You can utilize these fields with your preferred Kubernetes logging integrations to monitor when and how frequently tasks are retried.
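As a rough illustration of that kind of monitoring, the Python sketch below counts retry decisions in exported log lines. It assumes the logs are available as one JSON object per line with flat `error` and `app.to_retry` keys. The exact log shape depends on your logging integration, so treat the format, the function name `summarize_retries`, and the sample lines as hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_retries(log_lines: Iterable[str]) -> Counter:
    """Count runtime-failure events by whether the task will be retried."""
    counts: Counter = Counter()
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(event, dict):
            continue
        # Only consider failure events that carry both fields named above.
        if "error" not in event or "app.to_retry" not in event:
            continue
        key = "will_retry" if event["app.to_retry"] else "will_not_retry"
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; real log output may be structured differently.
    sample = [
        '{"error": "pod evicted", "app.to_retry": true}',
        '{"error": "node shutdown", "app.to_retry": true}',
        '{"error": "pod evicted", "app.to_retry": false}',
        "plain text line",
    ]
    print(summarize_retries(sample))
```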

Custom token secret

Using the configuration described above provisions a Kubernetes secret containing your resource class tokens. In some circumstances, you may wish to provision your own secret, or you simply might not want to specify the tokens via Helm. Instead, you can provision your own Kubernetes secret containing your tokens and specify its name in the agent.customSecret field.

The secret should contain a field for each resource class, using the resource class name as the key and the token as the value. Consider the following resourceClasses configuration:


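agent:
  resourceClasses:
    circleci-runner/resourceClass:
      # custom pod configuration for this resource class
    circleci-runner/resourceClass2: {}
  customSecret: <name_of_secret>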

The corresponding custom secret would have 2 fields:

circleci-runner.resourceClass: <my-token>
circleci-runner.resourceClass2: <my-token-2>

Due to Kubernetes secret key character constraints, the / separating the namespace and resource class name is replaced with a . character. Other than this, the name must exactly match the resourceClasses config to match the token with the correct configuration.
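The key naming rule can be expressed as a small Python helper. This is a sketch for generating the secret's fields yourself. The function names are hypothetical and are not part of container runner or the Helm chart.

```python
from typing import Dict


def secret_key_for(resource_class: str) -> str:
    """Convert '<namespace>/<resource-class>' into the matching secret key."""
    namespace, sep, name = resource_class.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"expected '<namespace>/<resource-class>', got {resource_class!r}")
    # The '/' is replaced with '.'; everything else must match exactly.
    return f"{namespace}.{name}"


def build_secret_data(tokens: Dict[str, str]) -> Dict[str, str]:
    """Map resource class names to tokens, keyed the way the custom secret expects."""
    return {secret_key_for(rc): token for rc, token in tokens.items()}


if __name__ == "__main__":
    print(build_secret_data({
        "circleci-runner/resourceClass": "<my-token>",
        "circleci-runner/resourceClass2": "<my-token-2>",
    }))
```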

Even if there is no further pod configuration, the resource class must be present in resourceClasses as an empty map, as shown by circleci-runner/resourceClass2 in the above config example.

Additional instructions can be found in our Support Center.

Helm chart parameters

The container runner Helm chart is hosted here.

The following are CircleCI-specific settings:

  • Runner API URL.

  • A (preferably) unique name assigned to this particular container-agent instance. This name appears on the Runner Inventory page in the CircleCI UI. If left unspecified, it defaults to container-agent (the name of the deployment).

  • agent.resourceClasses: Resource class task configuration. The default must be updated in order to run a job successfully. See the "Resource class configuration and custom task pod configuration" section above.

  • agent.customSecret: A user-provided Kubernetes secret containing resource class tokens. See the "Custom token secret" section above.

  • Termination grace period during container runner shutdown.

  • agent.maxRunTime: Maximum task run time. This value should be shorter than the termination grace period above. See the docs for potential values.

  • Maximum number of tasks claimed and run concurrently.

  • Option to enable or disable garbage collection.

  • agent.kubeGCThreshold: Length of time pods can run before they are deleted by garbage collection (GC).

  • Whether to enable the constraint checker.

  • Number of failed checks before claiming is disabled for a resource class.

  • The constraint check interval.

The following are Kubernetes object settings. All settings prefixed with agent below are for the container runner pod itself, not the ephemeral pods where jobs are executed.

  • Override the chart name.

  • Override the full generated name.

  • Number of container-agents to deploy.

  • Agent image registry.

  • Agent image repository.

  • Agent image pull policy.

  • Agent image tag.

  • Secret objects containing private registry credentials for the container runner pod itself, not the ephemeral pods that execute tasks.

  • Match labels used on agent pods. Default: app: container-agent.

  • Extra annotations added to agent pods.

  • Security context policies added to agent pods.

  • Security context policies added to agent containers.

  • Custom resource specifications for container runner pods.

  • Node selector for agent pods.

  • Node tolerations for agent pods.

  • Node affinity for agent pods.

  • Autodetect the OS and CPU architecture of the node that the task pod is running on. If false, the node is assumed to have the same OS and CPU architecture as the container runner pod, and cluster-wide permissions are not needed.

  • Create a custom service account for the agent.

  • Create a Role and RoleBinding for the service account.

  • Image registry for logging containers.

  • Image repository for logging containers.

  • Image tag for logging containers.

  • Create a custom service account token for logging containers.

  • Create a Role and RoleBinding for logging containers.


Container runner needs the following Kubernetes permissions:

  • Pods, Pods/Exec

    • Get

    • Watch

    • List

    • Create

    • Delete

  • Secrets

    • Get

    • List

    • Create

    • Delete

  • Events

    • Watch

  • Nodes

    • Get

    • List

If Rerun job with SSH is enabled, the following permissions are also required:

In addition, logging containers require the following minimal permissions to get service container logs and stream them to the CircleCI web app:

  • Pods, Pods/Logs

    • Watch

By default a Role, RoleBinding and service account are created and attached to the container runner pod, but if you customize these, the above are the minimum required permissions.

It is assumed that the container runner is running in a Kubernetes namespace without any other workloads. It is possible that the agent or garbage collection (GC) could delete pods in the same namespace.

Garbage collection

Each container runner has a garbage collector that removes any pods and secrets with the label that are left dangling in the cluster. By default, it removes all jobs older than five hours and five minutes. You can shorten or lengthen this threshold with the agent.kubeGCThreshold parameter. If you shorten the garbage collection (GC) threshold, also set the maximum task run time, via the agent.maxRunTime parameter, to a value smaller than the new GC threshold. Otherwise, a running task pod could be removed by the GC.
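The relationship between the two parameters can be checked before a deployment with a small Python sketch like the one below. It only encodes the rule stated above (the max run time must be smaller than the GC setting). The function name is hypothetical, and durations are passed as `timedelta` values rather than parsed from values.yaml.

```python
from datetime import timedelta

# Default GC setting stated above: five hours and five minutes.
DEFAULT_GC_THRESHOLD = timedelta(hours=5, minutes=5)


def gc_settings_are_safe(max_run_time: timedelta,
                         gc_threshold: timedelta = DEFAULT_GC_THRESHOLD) -> bool:
    """Return True when a task must finish before the GC may delete its pod."""
    return max_run_time < gc_threshold


if __name__ == "__main__":
    # A 2h max run time with a 1h GC setting risks deleting running task pods.
    print(gc_settings_are_safe(timedelta(hours=2), timedelta(hours=1)))
    # Shortening both, with the run time kept smaller, stays safe.
    print(gc_settings_are_safe(timedelta(minutes=45), timedelta(hours=1)))
```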

Container runner will drain and restart cleanly when sent a termination signal. Container runner will not automatically attempt to relaunch a task that fails to start; you can rerun such a task from the CircleCI web app.

If the container runner crashes, there is no expectation that in-process or queued tasks are handled gracefully.

Logging containers

Container runner schedules a logging container if there are secondary (service) containers in the task pod. This container will get the secondary container logs and stream them to the steps UI in the CircleCI web app. Task agent, which runs in the primary container, is responsible for streaming all other step output to the CircleCI web app. The only exception is the Task lifecycle step, which is streamed by container runner itself.

Logging containers require a service account token with the minimal privileges to get container logs.

Container runner currently sets the following default resource requests and limits on the logging container:

requests:
  cpu: 50m
  memory: 64Mi
limits:
  cpu: 100m
  memory: 128Mi

Constraint validation

Container runner allows you to configure task pods with the full range of Kubernetes settings. This means a pod can be configured with constraints that make it impossible to schedule. To help with this, container runner has a constraint checker that periodically validates each resource class configuration against the current state of the cluster, to ensure pods can be scheduled. This prevents container runner from claiming jobs that it cannot schedule and that would therefore fail.

If the constraint checker fails too many checks, it will disable claiming for that resource class until the checks start to pass again.

Currently the following constraints are checked against the cluster state:

As an example of how this works, consider the following resource class configuration:

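agent:
  resourceClasses:
    circleci-runner/resourceClass:
      token: TOKEN1
      spec:
        nodeSelector:
          disktype: ssd

    circleci-runner/resourceClass2:
      token: TOKEN2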

The first resource class has a node selector to ensure it is scheduled to nodes with an SSD. Suppose that, during operation, the cluster no longer has any nodes with that label. The constraint checker will now fail checks for circleci-runner/resourceClass and will disable claiming jobs for it until it finds nodes with the correct label again. Claiming for circleci-runner/resourceClass2 is not affected, because the checks for different resource classes are independent of each other.
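The behavior in this example can be modeled with a simplified Python sketch. It is not container runner's implementation: it only checks node selectors against node labels, it assumes failures are counted consecutively, and the threshold of 3 as well as all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ClassState:
    node_selector: Dict[str, str] = field(default_factory=dict)
    consecutive_failures: int = 0
    claiming_enabled: bool = True


def selector_matches_any_node(selector: Dict[str, str],
                              nodes: List[Dict[str, str]]) -> bool:
    """True if at least one node carries every label in the selector."""
    return any(all(labels.get(k) == v for k, v in selector.items())
               for labels in nodes)


def run_check(state: ClassState, nodes: List[Dict[str, str]],
              threshold: int = 3) -> None:
    """Run one check for a single resource class, independently of the others."""
    if selector_matches_any_node(state.node_selector, nodes):
        # Checks pass again: reset the counter and resume claiming.
        state.consecutive_failures = 0
        state.claiming_enabled = True
    else:
        state.consecutive_failures += 1
        if state.consecutive_failures >= threshold:
            state.claiming_enabled = False


if __name__ == "__main__":
    ssd_class = ClassState(node_selector={"disktype": "ssd"})
    plain_class = ClassState()
    nodes_without_ssd = [{"disktype": "hdd"}]
    for _ in range(3):
        run_check(ssd_class, nodes_without_ssd)
        run_check(plain_class, nodes_without_ssd)
    print(ssd_class.claiming_enabled, plain_class.claiming_enabled)
```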

Cost and availability

Container runner jobs are eligible for Runner Network Egress. This is in line with the existing pricing model for self-hosted runners, and will happen with close adherence to the rest of CircleCI’s network and storage billing roll-out. If there are questions, reach out to your point of contact at CircleCI.

The same plan-based offerings for self-hosted runner concurrency limits apply to the container runner. Final pricing and plan availability will be announced closer to the general availability of the offering.

Building container images

Docker in Docker is not recommended due to the security risk it can pose to your cluster.

To build container images in a container-agent job, you can use one of the following:

  • A third-party tool like Buildah or kaniko

  • Machine runner with Docker installed on it

  • CircleCI-hosted compute

Note: Third-party tools should be used at your own discretion.

While jobs that run with container-agent cannot use CircleCI’s setup_remote_docker feature, it is possible to use a third-party tool to build Docker images in your container-agent job without using the Docker daemon.

You can see an example on our community forum of how some users have successfully used kaniko to build a container image.

Another option is to use a tool called Buildah. Buildah can be used in your .circleci/config.yml syntax:

  - image:

Using the Buildah image

Buildah relies on the fuse-overlayfs program inside the container, and fuse-overlayfs requires access to /dev/fuse. A fuse device plugin must therefore be configured so that /dev/fuse is added to the container for Buildah's use. Kubernetes has a device plugin system to enable secure sharing of host devices with pods.

To install the /dev/fuse device plugin configuration, clone this repository to the location where you run Helm commands for your container-agent deployment. Then run:

kubectl create -f fuse-device-plugin-k8s-1.16.yml

You can confirm that this has been configured correctly by running kubectl get daemonset -n kube-system and confirming that fuse-device-plugin-daemonset is present and ready.

Once this device has been added, update the container-agent resource class configuration:

  token: <token>
     - resources:
        limits: 1

This will now let you run Buildah commands with container agent jobs and build containers:

    docker:
      - image:
    resource_class: <namespace>/<resourceClass>
    steps:
      - checkout
      - run:
          name: sanity-test
          command: |
            buildah version
      - run:
          name: Building-a-container
          command: |
            # Build the image, then push the same tag that was just built
            buildah bud -f ./Dockerfile -t myimage:0.1
            buildah push myimage:0.1

Using Buildah with custom images

You can also build your own custom image and include the installation of Buildah in your Dockerfile:

sudo yum install buildah

If you plan to use a CircleCI convenience image, ensure you add the repository for installation to your job’s steps:

sudo apt-get update
sudo apt-get install -y wget ca-certificates gnupg2
VERSION_ID=$(lsb_release -r | cut -f2)
echo "deb${VERSION_ID}/ /" | sudo tee /etc/apt/sources.list.d/devel-kubic-libcontainers-stable.list
curl -Ls$VERSION_ID/Release.key | sudo apt-key add -
sudo apt-get update
sudo apt install buildah -y

Additionally, set the isolation variable to default to chroot:

# Default to isolate the filesystem with chroot.
ENV BUILDAH_ISOLATION=chroot

You can then follow the same instructions as Using the Buildah image above to add the fuse device plugin to the container-agent deployment and update your .circleci/config.yml file to use your custom images and build container images in those jobs.


Limitations

  • Any known limitation for the existing self-hosted runner will continue to be a limitation of container runner.

  • There is no support for container environments other than Kubernetes at this time.

  • setup_remote_docker as a command is not supported with container runner. See Building Container Images.


Visit the runner FAQ page to see commonly asked questions about container runner.
