Resource consumption and cost management

Server 4.9 Server Admin

Understanding how CircleCI Server consumes cloud resources helps you plan capacity, set appropriate budgets, and detect cost anomalies.

Resource cost drivers

CircleCI Server consumes cloud resources across several categories:

Compute resources

Table 1. Compute resources overview

Resource                     | Component             | Description                                      | Cost impact
---------------------------- | --------------------- | ------------------------------------------------ | --------------------------------
Nomad worker nodes (ASG/MIG) | Execution environment | EC2/GCE instances that run Docker executor jobs  | Highest - scales with job volume
Machine executor jobs        | Machine Provisioner   | On-demand instances for machine executor jobs    | High - billed per job duration
Kubernetes nodes             | Control plane         | EKS/GKE nodes running CircleCI services          | Medium - stable

Nomad workers

Nomad workers are the primary compute cost driver. They scale up when jobs are queued and scale down when idle, controlled by the Nomad Autoscaler.

Cost factors include:

  • Instance type configured in your Terraform module.

  • Number of concurrent jobs.

  • Job duration.

  • Autoscaler configuration (min, max, cooldown).

If the Nomad Autoscaler fails to scale down (for example, due to nodes stuck in draining state), compute costs can increase. Monitor autoscaler logs and ASG/MIG instance counts.
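As an illustration, a small shell check that compares an ASG's desired capacity with its in-service instance count. The ASG name is a placeholder; substitute your Nomad client group:

```shell
#!/bin/sh
# Hypothetical ASG name - replace with your Nomad client auto scaling group.
ASG_NAME="my-circleci-nomad-clients"

# Fetch desired capacity and in-service count (requires AWS CLI credentials):
# aws autoscaling describe-auto-scaling-groups \
#   --auto-scaling-group-names "$ASG_NAME" \
#   --query 'AutoScalingGroups[0].[DesiredCapacity,length(Instances)]' \
#   --output text

# Flag a possible stuck scale-in: more instances in service than desired.
check_scale_in() {
  desired=$1
  in_service=$2
  if [ "$in_service" -gt "$desired" ]; then
    echo "WARNING: $in_service instances running, desired capacity is $desired"
  else
    echo "OK"
  fi
}
```

A persistent gap between the two values usually means scale-in is failing and the autoscaler logs should be inspected.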

Machine executor jobs

Machine Provisioner creates on-demand instances for jobs using the machine executor. Each job spawns a dedicated instance that is terminated after job completion.

Cost factors include:

  • Number of machine executor jobs.

  • Job duration.

  • Instance type configured in values.yaml.

Storage resources

Table 2. Storage resources overview

Resource                            | Description                                      | Cost impact
----------------------------------- | ------------------------------------------------ | ------------------------
Object storage (S3/GCS)             | Build artifacts, caches, workspaces              | Medium - grows over time
Block storage (EBS/Persistent Disk) | Root volumes for Nomad workers, database storage | Low to medium

CircleCI uses object storage for:

  • Build artifacts - Files uploaded via store_artifacts.

  • Test results - JUnit XML files from store_test_results.

  • Caches - Dependency caches from save_cache/restore_cache.

  • Workspaces - Data shared between jobs via persist_to_workspace/attach_workspace.

  • Action logs - Step output logs displayed in the UI.

  • Workflow configuration - Pipeline and workflow definition data.

  • Audit logs - System audit events.
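As a sketch, a single job configuration that produces several of these object types. The image, cache key, and paths are placeholders; the step names are standard CircleCI configuration keys:

```yaml
jobs:
  build:
    docker:
      - image: cimg/base:stable  # placeholder image
    steps:
      - checkout
      - restore_cache:           # reads a cache object from object storage
          keys:
            - deps-v1-{{ checksum "package-lock.json" }}
      - run: make build
      - save_cache:              # writes a cache object
          key: deps-v1-{{ checksum "package-lock.json" }}
          paths:
            - node_modules
      - store_test_results:      # writes test result objects
          path: test-results
      - store_artifacts:         # writes build artifact objects
          path: dist
      - persist_to_workspace:    # writes workspace objects
          root: .
          paths:
            - dist
```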

Configure lifecycle policies for your object storage bucket to automatically expire old objects and control storage costs. See Data Retention in Server for details on configuring retention periods and S3 lifecycle policies.
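For example, a minimal S3 lifecycle rule with a hypothetical bucket name and a 30-day expiration; tune the prefix and retention period to your own policy:

```shell
#!/bin/sh
# Expire all objects after 30 days. Scope the rule with a Prefix filter
# (for example "circle-artifacts/") to expire only one category of data.
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-old-objects",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 30 }
    }
  ]
}
EOF

# Apply it (requires AWS CLI credentials; bucket name is hypothetical):
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket my-circleci-data-bucket \
#   --lifecycle-configuration file://lifecycle.json
```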

Resource tagging

Proper resource tagging enables accurate cost attribution in your cloud provider’s cost management tools.

Auto-tagged resources

CircleCI automatically tags certain resources:

Table 3. Auto-tagged resources

Resource                         | Tag/Label         | Value
-------------------------------- | ----------------- | ------------------------------
Machine executor instances (AWS) | ManagedBy         | circleci-machine-provisioner
Machine executor instances (AWS) | ResourceClass     | For example, medium, large
Machine executor instances (GCP) | managed-by        | circleci-machine-provisioner
Nomad workers (AWS)              | vendor            | circleci
Nomad workers (AWS)              | nomad-environment | Your configured base name
Nomad workers (GCP)              | Network tag       | <name>-circleci-nomad-clients

Configurable tags

Machine executor tags (AWS)

You can add custom tags to machine executor instances via values.yaml:

machine_provisioner:
  providers:
    ec2:
      tags:
        key1: "value1"
        key2: "value2"

Nomad worker tags

Nomad worker tags are configured in the Terraform module via the instance_tags variable:

instance_tags = {
  "vendor"      = "circleci"
  "environment" = "production"
  "team"        = "platform"
}

Cost monitoring

We recommend using your cloud provider’s native cost management tools to monitor CircleCI Server resource consumption.

AWS

Activate the ManagedBy and vendor tags as cost allocation tags in the AWS Billing console so they appear in Cost Explorer. Then filter by tag ManagedBy: circleci-machine-provisioner to track machine executor costs separately, and by tag vendor: circleci for Nomad worker costs.
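As a sketch, querying monthly machine executor spend with the AWS CLI. The dates are placeholders, and the query assumes ManagedBy has been activated as a cost allocation tag:

```shell
#!/bin/sh
# Build a Cost Explorer filter for the auto-applied Machine Provisioner tag.
TAG_KEY="ManagedBy"
TAG_VALUE="circleci-machine-provisioner"
FILTER=$(printf '{"Tags":{"Key":"%s","Values":["%s"]}}' "$TAG_KEY" "$TAG_VALUE")

# Monthly unblended cost for machine executor instances
# (requires AWS CLI credentials and Cost Explorer enabled):
# aws ce get-cost-and-usage \
#   --time-period Start=2024-01-01,End=2024-02-01 \
#   --granularity MONTHLY \
#   --metrics UnblendedCost \
#   --filter "$FILTER"
echo "$FILTER"
```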

GCP

Common cost issues

The following scenarios can lead to unexpected costs:

  • Instances not scaling down - Cloud provider API timeouts, rate limiting, or transient errors can prevent scale-in operations from completing, leaving instances running longer than expected.

  • Orphaned compute instances - Cloud provider API failures during instance termination can result in instances that continue running after their associated jobs complete.

  • Storage growth - Build artifacts and caches accumulate over time. Configure lifecycle policies for your object storage bucket to automatically expire old data.
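The orphaned-instance scenario above can be checked for periodically. A sketch, assuming the auto-applied ManagedBy tag and a hypothetical one-day threshold (set the threshold above your longest expected job duration):

```shell
#!/bin/sh
# List running machine executor instances with their launch times
# (requires AWS CLI credentials):
# aws ec2 describe-instances \
#   --filters "Name=tag:ManagedBy,Values=circleci-machine-provisioner" \
#             "Name=instance-state-name,Values=running" \
#   --query 'Reservations[].Instances[].[InstanceId,LaunchTime]' \
#   --output text

# Flag an instance as a possible orphan when it has been running longer
# than the threshold (all times are epoch seconds).
is_possible_orphan() {
  launch_epoch=$1
  now_epoch=$2
  threshold=$3
  if [ $((now_epoch - launch_epoch)) -gt "$threshold" ]; then
    echo "possible-orphan"
  else
    echo "ok"
  fi
}
```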