CircleCI Server v3.x Overview

Introduction

CircleCI server is an on-premises CI/CD platform for enterprise customers who have compliance or security needs that require them to operate within their firewall, in a private cloud or data center.

Server offers the same features as CircleCI’s cloud offering, but operates within your Kubernetes cluster.

Services Architecture
Figure 1. CircleCI Server v3.x Architecture

The CircleCI server application exposes four services using load balancers. Three of these load balancers are VPC-internal for connecting to the Nomad cluster and virtual machines. If required, the frontend load balancer can be made private, separating it from the public internet. For further information see the Load Balancers doc.

Load Balancer Type Ports Description

Frontend GUI Proxy & API

External

80 and 443

Exposes the web application.

Nomad Control Plane

Internal

4647

Exposes an RPC protocol for Nomad runners.

Output Processor

Internal

8585

Ingests output from Nomad runners.

VM Service

Internal

3000

Provisions virtual machines.

The application exposes a number of external ports. These ports are used for various functions as defined in the table below.

Port number Protocol Direction Source / Destination Use Notes

80

TCP

Inbound

End users

HTTP web app traffic

443

TCP

Inbound

End users

HTTP web app traffic

8800

TCP

Inbound

Administrators

Admin console

22

TCP

Inbound

Administrators

SSH

Only required for the bastion host

64535-65535

TCP

Inbound

SSH into builds

Only required for the nomad clients.

CircleCI server schedules CI jobs using the Nomad scheduler. The Nomad control plane runs inside of Kubernetes, while the Nomad clients, which are responsible for running scheduled CircleCI jobs, are provisioned outside the cluster. CircleCI server can run Docker jobs on the Nomad clients themselves or in a dedicated virtual machine (VM).

Job artifacts and output are sent directly from jobs in Nomad to object storage (S3, GCS, or other supported options). Audit logs and other items from the application are also stored in object storage so both the Kubernetes cluster and the Nomad clients need access to object storage.

Services

CircleCI server 3.0 consists of the following services. Find their descriptions and failure implications below:

Service Component Description What happens if it fails? Notes

api-service

App Core

Provides a GraphQL API that provides much of the data to render the web frontend.

Many parts of the UI (e.g. Contexts) will fail completely.

audit-log-service

App Core

Persists audit log events to blob storage for long term storage.

Some events may not be recorded.

builds-service

App Core

Ingests from www-api and sends to plans-service, workflows-conductor, and to orbs-service

circle-legacy-dispatcher

Execution

Part of Compute Management. Sends to usage Q (mongo) and back to VCS.

circleci-mongodb

Execution

Primary datastore

circleci-postgres

Data storage for microservices

circleci-rabbitmq

Pipelines and Execution

Queuing for workflow messaging, test-results, usage, crons, output, notifications, and scheduler

circleci-redis

Execution

Cache data that will not be stored permanently (i.e. build logs), for request caching, and for rate limit calculations.

A failed cache can end up resulting in rate limiting from the VCS if too many calls are made to it.

circleci-telegraf

Telegraf collects statsd metrics. All white-boxed metrics in our services publish statsd metrics that are sent to telegraf, but can also be configured to be exported to other places (i.e. Datadog or Prometheus)

circleci-vault

HashiCorp Vault to run encryption and decryption as a service for secrets

config

contexts-service

App Core

Stores and provides encrypted contexts.

All builds using Contexts will fail.

cron-service

Pipelines

Triggers scheduled workflows.

Scheduled workflows will not run.

dispatcher

Execution

Split jobs into tasks and send them to scheduler to run.

No jobs will be sent to Nomad, the run queue will increase in size but there should be no meaningful loss of data.

domain-service

App Core

Stores and provides information about our domain model. Works with permissions and API

Workflows will fail to start and some REST API calls may fail causing 500 errors in the CircleCI UI. If LDAP authentication is in use, all logins will fail.

exim

Will be removed in GA, but users can provide mail submission credentials to an existing MTA

No email notifications will be sent.

federations-service

App Core

Stores user identities (LDAP). API and permissions-service

If LDAP authentication is in use, all logins will fail and some REST API calls might fail.

LDAP integration not available

frontend

Frontend

CircleCI web app and www-api proxy.

The UI and REST API will be unavailable and no jobs will be triggered by GitHub/Enterprise. Running builds will be OK but no updates will be seen.

Rate limit of 150 requests per second with a single user instantaneous limit of 300 requests.

inject-bottoken

A Kubernetes job that inserts a "bot token" into MongoDB. Bot tokens are authorization interservice communication. Mainly for www-api

kotsadm-kots

Licensing

The main Kots application. Runs the Kots admin console where upgrades and configuration of server take place No admin console available.

No upgrades or configuration possible for server

kotsadm-migrations

Licensing

Performs database migrations to handle updates of Kotsadm

kotsadm-minio

Licensing

Object storage for Kots licensing

kotsadm-operator

Licensing

Deploys and controls Kotsadm

kotsadm-postgres

Licensing

Database for Kots licensing

legacy-notifier

App Core

Handles notifications to external services (Slack, email, etc.)

prometheus

Server

Used for metrics

orb-service

Pipelines

Handles communication between orb registry and config.

output-processor

Execution

Receives job output & status updates and writes them to MongoDB. Also provides an API to running jobs to access caches, workspaces, store caches, workspaces, artifacts, & test results.

permissions-service

App Core

Provides the CircleCI permissions interface.

Workflows will fail to start and some REST API calls may fail, causing 500 errors in the UI.

scheduler

Execution

Runs tasks sent to it. Works with Nomad server.

No jobs will be sent to Nomad, the run queue will increase in size but there should be no meaningful loss of data.

server-troubleshooter

Data

Runs commands inside pods and appends output to support bundles.

May not be available in GA.

slanger

server

Provides real-time events to the CircleCI app.

Live UI updates will stop but hard refreshes will still work.

test-results

Execution

Parses test result files and stores data.

There will be no test failure or timing data for jobs, but this will be back-filled once the service is restarted.

vm-gc

Compute Management

Periodically check for stale machine and remote Docker instances and request that vm-service remove them.

Old vm-service instances might not be destroyed until this service is restarted.

vm-scaler

Machine

Periodically requests that vm-service provision more instances for running machine and remote Docker jobs.

VM instances for machine and Remote Docker might not be provisioned causing you to run out of capacity to run jobs with these executors.

Different overlay for EKS vs. GKE.

vm-service

Machine

Inventory of available vm-service instances, and provisioning of new instances.

Jobs that use machine or remote Docker will fail.

workflows-conductor-event-consumer

Pipelines

Takes in information from VCS to kick off pipelines.

New Pipelines will not be kicked off when there are changes in the VCS.

workflows-conductor-grpc-handler

Pipelines

Helps translate the information through gRPC.

web-ui-*

Frontend

Micro Front End (MFE) services used to render the frontend web application GUI.

The respective services page will fail to load. Example: A web-ui-server-admin failure means the server Admin page will fail to load.

The MFE’s are used to render the web application located at app.<my domain here>

Platforms

CircleCI server is designed to deploy within a Kubernetes cluster. Virtual machine service (VM Service) is able to leverage unique EKS or GKE offerings to dynamically create VM images.

If installing outside of EKS or GKE, additional work is required to get access to some of the same machine build features. Setting up CircleCI runners will give you access to the same feature set as VM service across a much wider range of OSs and machine types (MacOS and much more).

We do our best to support a wide range of platforms for installation. We use environment-agnostic solutions whenever possible. However, we do not test all platforms and options. For that reason we provide a list of tested environments, which we will continue to expand over time. We will be adding OpenShift to our list of regularly tested and supported platforms.

Environment Status Notes

EKS

Tested

GKE

Tested

Azure

Untested

Should work with Minio Azure gateway and Runner

Digital Ocean

Untested

Should work with Minio Digital Ocean gateway and Runner

OpenShift*

Untested

Known to not work

Rancher

Untested

Should work with Minio and Runner



Help make this document better

This guide, as well as the rest of our docs, are open-source and available on GitHub. We welcome your contributions.