CircleCI Server v3.x Metrics and Monitoring

Metrics such as CPU and memory usage, together with internal metrics, are useful for:

  • Quickly detecting incidents and abnormal behavior

  • Dynamically scaling compute resources

  • Retroactively understanding infrastructure-wide issues

Metrics Collection

Scope

Your CircleCI server installation collects a number of metrics and logs by default. These are useful for monitoring the health of your system and for debugging issues with your installation.

Data is retained for a maximum of 15 days.
Prometheus Server is not limited to scraping metrics from your CircleCI server installation. By default, it scrapes metrics from your entire cluster. If needed, you can disable Prometheus from the KOTS admin console config.

Prometheus

Prometheus is a leading monitoring and alerting system for Kubernetes. Server 3.x ships with a basic implementation for monitoring common performance metrics.

KOTS Admin - Metrics Graphs

By default, an instance of Prometheus is deployed with your CircleCI server installation. Once it is deployed, you can provide the address of your Prometheus instance to the KOTS admin dashboard. KOTS uses this address to generate graphs of the CPU and memory usage of containers in your cluster.

The default Prometheus address is http://prometheus-server

From the KOTS dashboard, select "configure graphs", then enter http://prometheus-server. KOTS will then generate resource usage graphs.

Telegraf

Most services running on server report StatsD metrics to the Telegraf pod running in server. The Telegraf configuration is fully customizable, so you can forward your metrics to any destination that Telegraf supports through its output plugins. By default, Telegraf exposes a metrics endpoint for Prometheus to scrape.
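As a rough illustration of the output-plugin mechanism described above, the sketch below adds Telegraf's file output so that collected metrics are also written to the Telegraf pod's standard output, which can help confirm that metrics are arriving. The choice of plugin is only an example and is not part of the default server configuration. How a custom fragment like this combines with the built-in configuration is an assumption to verify on your own installation.

```toml
# Illustrative only: echo metrics to stdout in addition to any existing outputs.
[[outputs.file]]
  ## "stdout" is a special value that writes to standard output instead of a file
  files = ["stdout"]
  ## Serialize metrics as InfluxDB line protocol
  data_format = "influx"
```

Because every output plugin receives the same metrics, an extra output such as this one should not change what Prometheus scrapes. Remove it once you have finished checking, since it can produce a large volume of log output.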

Use Telegraf to forward metrics to Datadog

The following example shows how to configure Telegraf to output metrics to Datadog:

Open the management console dashboard and select Config from the menu bar. Under Observability and monitoring, locate the Custom Telegraf config section. This section contains an editable text window where you can configure plugins that forward Telegraf metrics from your server installation. To forward metrics to Datadog, add the following, substituting my-secret-key with your Datadog API key:

[[outputs.datadog]]
  ## Replace "my-secret-key" with Datadog API key
  apikey = "my-secret-key"

For more options, see the InfluxData docs.
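The Datadog output plugin accepts a few settings beyond the API key. The sketch below is a hedged example rather than a recommended configuration: it sets an explicit request timeout and shows, commented out, where an alternative intake URL would go for accounts hosted on a non-default Datadog site. The timeout value and the EU URL are assumptions chosen for illustration and should be checked against your Datadog account.

```toml
[[outputs.datadog]]
  ## Replace "my-secret-key" with your Datadog API key
  apikey = "my-secret-key"
  ## Write timeout for requests to Datadog (value chosen for illustration)
  timeout = "5s"
  ## Optional: override the intake URL for accounts on another Datadog site.
  ## The value below is an assumed example, not taken from this guide.
  # url = "https://app.datadoghq.eu/api/v1/series"
```

If you leave url commented out, the plugin uses its built-in default endpoint, so most installations need only the apikey line.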


