CircleCI Server Container Architecture

This document outlines the containerized services that run on the Services machine within a CircleCI Server installation. It is provided both to give an overview of how the services operate and to help with troubleshooting in the event of a service outage. Supplementary notes and a key to the failure severity ratings precede the table.


  • Database migrator services are listed here with a low failure severity because they only run at startup. However, if a migrator service is down at startup, the services that depend on it will fail.
  • With a platinum support contract, some services (marked with * here) can be externalized and managed to suit your requirements. Externalization provides higher data security and allows redundancy to be built into your system.


Icon Description

Failure has a minor effect on production: no loss of data or functionality.

Failure might cause issues with some jobs, but no loss of data.

Failure can cause loss of data, corruption of jobs/workflows, major loss of functionality.

Containers, Roles, Failure Modes and Startup Dependencies

Container / Image Role What happens if it fails? Failure severity Startup dependencies


Provides a GraphQL API that supplies much of the data needed to render the web frontend.

Many parts of the UI (e.g. Contexts) will fail completely.

postgres, frontend, contexts-service-migrator, contexts-service, vault-cci


Persists audit log events to blob storage for long term storage.

Some events may not be recorded.

postgres, frontend


Stores and provides encrypted contexts.

All builds using Contexts will fail.

postgres, frontend, contexts-service-migrator, vault-cci


Runs PostgreSQL migrations for the contexts-service.

Only runs at startup.

postgres, frontend


Triggers scheduled workflows.

Scheduled workflows will not run.

postgres, frontend, cron-service-migrator


Runs PostgreSQL migrations for the cron-service.

Only runs at startup.

postgres, frontend


Stores and provides information about our domain model.

Workflows will fail to start and some REST API calls may fail, causing 500 errors in the CircleCI UI. If LDAP authentication is in use, all logins will fail.

postgres, frontend, domain-service-migrator


Runs PostgreSQL migrations for the domain-service.

Only runs at startup.

postgres, frontend


Mail Transfer Agent (MTA) used to send all outbound SMTP.

No email notifications will be sent.



Stores user identities (LDAP).

If LDAP authentication is in use, all logins will fail and some REST API calls might fail.

only if LDAP in use

postgres, frontend, federations-service-migrator


Runs PostgreSQL migrations for the federations-service.

Only runs at startup.

postgres, frontend


File storage service used as a replacement for S3 when CircleCI Server is run outside of AWS. Not used if Server is configured to use S3. Stores step output logs, artifacts, test results, caches and workspaces.

If not using S3, builds will produce no output and some REST API calls might fail.

if not using S3



CircleCI web app and www-api proxy.

The UI and REST API will be unavailable and no jobs will be triggered by GitHub/Enterprise. Running builds will be OK but no updates will be seen.


mongo *

Mongo data store.

Potential total data loss. All running builds will fail and the UI will not work.



Queries the nomad server for stats and sends them to statsd.

Nomad metrics will be lost, but everything else should run as normal.


output-processor / output-processing

Receives job output and status updates and writes them to MongoDB. Also provides an API that running jobs use to access caches and workspaces, and to store caches, workspaces, artifacts, and test results.

All running builds will either fail or be left in an unfixable, inconsistent state. There will also be data loss in terms of step output, test results and artifacts.



Provides the CircleCI permissions interface.

Workflows will fail to start and some REST API calls may fail, causing 500 errors in the UI.

postgres, frontend, permissions-service-migrator


Runs PostgreSQL migrations for the permissions-service.

Only runs at startup.

postgres, frontend


Splits a job into tasks and sends them to schedulerer to be run.

No jobs will be sent to Nomad. The run queue will increase in size, but there should be no meaningful loss of data.


postgres / postgres-script-enhance *

Basic PostgreSQL with enhancements for creating required databases when containers are launched.

Potential total data loss. All running builds will fail and the UI will not work.


rabbitmq / rabbitmq-delayed *

Runs the RabbitMQ server. Most of our services use RabbitMQ for queueing.

Potential total data loss. All running builds will fail and the UI will not work.


outputRunningRedis / redis *

The Redis key/value store.

Output from currently running job steps will be lost. API calls out to GitHub may also fail.



Sends tasks to server-nomad to run.

No jobs will be sent to Nomad. The run queue will increase in size, but there should be no meaningful loss of data.


mongodb-upgrader / server-mongo-upgrader

Used to run any mongo conversion/upgrade scripts during mongo version upgrade.

Not required to run all the time.


nomad_server / server-nomad *

Nomad primary service.

No 2.0 build jobs will run.


ready-agent / server-ready-agent

Called by Replicated to check whether other containers are ready.

Only required at startup. If it is unavailable at startup, the whole system will fail.



Sends the user count to the internal CircleCI “phone home” endpoint.

CircleCI will not receive usage stats for your install, but there is no effect on operation.



Checks the frontend container for 1.0 Builder shutdown requests. If a request is found, the 1.0 Builder is shut down.

1.0 Builder lifecycles will not be properly managed, but jobs will continue to run.



Provides real-time events to the CircleCI app.

Live UI updates will stop but hard refreshes will still work.



The statsd forwarding agent that local services write to. It can be configured to forward metrics to an external metrics service.

Metrics will stop working but jobs will continue to run.



Used to manage log rotations for all containers on the services machine.

If this stays down for a long period, the Services machine disk will eventually run out of space and other services will fail.



Parses test result files and stores data.

There will be no test failure or timing data for jobs, but this will be back-filled once the service is restarted.


contexts-vault / vault-cci *

Instance of HashiCorp’s Vault, an encryption service that provides key management, secure storage, and other encryption-related services. Used to handle the encryption and key store for the contexts-service.

contexts-service will stop working, and all jobs that use contexts-service will fail.



Periodically checks for stale machine and remote Docker instances and requests that vm-service remove them.

Old vm-service instances might not be destroyed until this service is restarted.



Periodically requests that vm-service provision more instances for running machine and remote Docker jobs.

VM instances for machine and remote Docker jobs might not be provisioned, causing you to run out of capacity to run jobs with these executors.



Inventory of available vm-service instances, and provisioning of new instances.

Jobs that use machine or remote Docker will fail.



Used to run database migrations for vm-service.

Only runs at startup.



Coordinates and provides information about workflows.

No new workflows will start, currently running workflows might end up in an inconsistent state, and some REST and GraphQL API requests will fail.

postgres, frontend, workflows-conductor-migrator


Runs PostgreSQL migrations for the workflows-conductor.

Only runs at startup.

postgres, frontend
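
When troubleshooting, it can help to treat the startup dependencies in the table as a dependency graph. The following is a minimal sketch, not part of CircleCI Server, that derives one valid startup order and lists which containers are blocked when a given container fails to start. It covers only four rows; the row names are inferred from the dependency columns and descriptions and should be treated as assumptions, as should the helper names `startup_order` and `blocked_by`. It requires Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Startup dependencies copied from the table for four rows only.
# Row names are inferred from the dependency columns (an assumption).
STARTUP_DEPENDENCIES = {
    "contexts-service": {"postgres", "frontend", "contexts-service-migrator", "vault-cci"},
    "contexts-service-migrator": {"postgres", "frontend"},
    "cron-service": {"postgres", "frontend", "cron-service-migrator"},
    "cron-service-migrator": {"postgres", "frontend"},
}


def startup_order(dependencies):
    """Return one valid order in which every container follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def blocked_by(failed, dependencies):
    """Return containers that depend, directly or transitively, on `failed`."""
    blocked = set()
    changed = True
    while changed:
        changed = False
        for name, deps in dependencies.items():
            if name not in blocked and (failed in deps or deps & blocked):
                blocked.add(name)
                changed = True
    return blocked


if __name__ == "__main__":
    print("Startup order:", startup_order(STARTUP_DEPENDENCIES))
    print("Blocked if contexts-service-migrator fails:",
          blocked_by("contexts-service-migrator", STARTUP_DEPENDENCIES))
```

Extending the dictionary with the remaining rows of the table gives a quick way to see which services to inspect first when a container does not come up.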
