Detect hardcoded secrets with GitGuardian

Developers are building features at an unprecedented speed using what they need from the software ecosystem. These ever-expanding options include open-source libraries and packages, SaaS tools, deployment systems, cloud services, and more. To keep things secure, we always need the same thing: a secret.

What is a secret?

Secrets are digital authentication credentials (API keys, certificates, and tokens) used in applications, services, or infrastructures. Just like a password (plus a device in case of MFA) is used to authenticate a person, a secret authenticates a system to enable interoperability.

Watch the video: What is a secret?

Why are secrets a problem in CI/CD environments?

Software engineers need to handle more and more credentials as they use CI/CD pipelines to deploy artifacts, apps, and infrastructure to multiple environments. There are many places where secrets can be insecurely exposed:

Source code
Build, test, or deployment CI/CD workflows
Container image layers
Runner console output

Leaked credentials aren’t just a security problem; rotating a leaked secret interrupts CI/CD workflows.

How do secrets end up in source code?

Two words: human error.

The vast majority of leaked credentials result from mistakes, not malicious intent. A developer hardcodes a credential as a temporary shortcut and forgets to remove it, or does not realize that Git keeps a deleted secret in its history. New developers may not know the proper procedures, or a check is skipped. The list of possible mistakes is enormous.

Why are hardcoded secrets different from other types of vulnerabilities?

Other vulnerabilities can be pinpointed as a specific weakness in the current code. Detecting secrets, by contrast, requires scanning the whole commit history of a project.

There are two possibilities when a developer mistakenly commits a secret: either the developer notices the mistake, or they do not.

In the former case, one very common mistake would be to delete it and simply commit the change. The secret disappears from the current state of the source code, but it is still in the commit history!

In the latter case, the secret will likely reach the remote version control system (VCS), and at that point it should be considered leaked. In the best case, it is detected during code review, but even then it may already need to be rotated.

It is not uncommon to find valid secrets hidden deep inside the codebase history. Secrets detection needs to take into account this attack surface and scan for incremental changes to the repository to prevent these kinds of leaks.
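To make the point about deleted secrets concrete, here is a toy sketch in Python. It is not how GitGuardian detects secrets: the single regular expression, the Commit class, and the fake key are assumptions made for illustration only. It shows that a key removed in a later commit is absent from the current code but still present in an earlier patch.

```python
import re
from dataclasses import dataclass

# Simplified pattern for an AWS access key ID ("AKIA" + 16 uppercase letters or digits).
# Assumption for illustration; real detectors combine many patterns and validity checks.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


@dataclass
class Commit:
    sha: str    # hypothetical short commit ID
    patch: str  # unified-diff text: "+" lines were added, "-" lines were removed


def find_leaks(history):
    """Return (sha, key) pairs for every key found on a line added by a commit."""
    leaks = []
    for commit in history:
        for line in commit.patch.splitlines():
            # Only added lines introduce a secret; skip the "+++" file header.
            if line.startswith("+") and not line.startswith("+++"):
                for key in AWS_KEY_ID.findall(line):
                    leaks.append((commit.sha, key))
    return leaks


# Commit a1 hardcodes a (fake) key; commit b2 "fixes" it by deleting the line.
history = [
    Commit("a1", '+client = connect("AKIAABCDEFGHIJKLMNOP")'),
    Commit(
        "b2",
        '-client = connect("AKIAABCDEFGHIJKLMNOP")\n'
        '+client = connect(os.environ["AWS_KEY_ID"])',
    ),
]
current_state = 'client = connect(os.environ["AWS_KEY_ID"])'

print(AWS_KEY_ID.findall(current_state))  # no key in the current code
print(find_leaks(history))                # the key is still reachable through commit a1
```

Scanning only the latest state of the files would miss this key, which is why a secrets scanner has to look at each commit's changes.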

Getting started with GitGuardian

In this tutorial, you will learn how to add GitGuardian real-time monitoring to a CircleCI workflow to scan every new commit for secrets.

GitGuardian detects secrets in your repositories, both in the existing commit history and in incremental commits. Detection can run at multiple stages of the development lifecycle: on the developer’s local machine with a pre-commit or pre-push hook, on the server with a pre-receive hook, or in a CI environment.

The GitGuardian dashboard provides company-wide visibility, so you can secure all your repositories at once.

Company-wide perimeter

The dashboard also empowers developers and AppSec engineers to collaborate through the full remediation process. We will not cover this in the tutorial but you can learn more in the documentation.

Prerequisites

To follow this tutorial, you will need:

A CircleCI account
A GitHub account
A GitGuardian account

Fork the sample repository

In this tutorial, you will use a sample_secrets test repository from GitGuardian. This repository contains a variety of secrets for testing purposes. Fork it to your GitHub user account or to a GitHub organization where you are an admin.

Then, open the CircleCI Projects page, click the sample_secrets name, then select Faster: Commit a starter CI pipeline to a new branch.

This creates the new branch circleci-project-setup in the repository, containing the demo workflow say-hello-workflow, configured by the .circleci/config.yml file.

Create a GitGuardian API token

You need a GitGuardian API token to use the GitGuardian orb. From the GitGuardian dashboard, go to API > Personal access tokens and then click Create Token. Give the token a scan scope and a memorable name:

Create a token

Copy the token and keep it handy; it’s the only time you can view it.

Note: If you are under GitGuardian’s Business plan or the 30-day Business trial, create a service account instead of a personal access token. A service account is a special type of API key intended to represent a non-human user like a CI runner. To create one, go to API > Service accounts and follow the same steps.

From the CircleCI dashboard, click the sample_secrets project, then Project settings > Environment Variables. Click Add Environment Variable. Name it GITGUARDIAN_API_KEY, and give it the same value as the token you copied earlier.
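Storing the token as an environment variable keeps it out of the repository; the job reads it at run time instead. As a rough sketch of that pattern in Python (the load_api_key helper is hypothetical and is not part of ggshield or the orb), a script can read the variable and fail fast when it is missing:

```python
import os


def load_api_key(env=os.environ):
    """Read the token from the environment instead of hardcoding it in source."""
    key = env.get("GITGUARDIAN_API_KEY", "").strip()
    if not key:
        # Failing early gives a clearer error than an authentication failure later on.
        raise RuntimeError("GITGUARDIAN_API_KEY is not set")
    return key
```

Passing a plain dictionary as env makes the helper easy to exercise locally without touching the real environment.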

Scan incremental changes with ggshield

You now need to add a workflow in your CircleCI config.yml to use the ggshield orb.

Replace the contents of the file with this:

version: 2.1

orbs:
  ggshield: gitguardian/ggshield@volatile

workflows:
  scan_my_commits:
    jobs:
      - ggshield/scan:
          name: ggshield-scan
          base_revision: <<pipeline.git.base_revision>>
          revision: <<pipeline.git.revision>>

You can also find this snippet on the ggshield orb registry page.

The base_revision and revision values will be populated when the pipeline is triggered:

base_revision is the commit ID where the range to scan starts.
revision is the ID of the last commit to scan.

In this configuration, only the latest commits are scanned, which is convenient for a CI pipeline. You might not want to scan the whole git history on every pipeline launch. The scan operates on all the commits since the last revision to ensure that no secrets were committed and then deleted.
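As a sketch of how such a range selects commits, consider the Python function below. In the job output later in this tutorial, each range made of two commit IDs yields a single commit to scan, so the sketch assumes the commit at base_revision was already covered by the previous run and is excluded. The commits_to_scan function and the shortened commit IDs are illustrative, and a linear history is assumed.

```python
def commits_to_scan(history, base_revision, revision):
    """Select the commits after base_revision, up to and including revision.

    history lists commit IDs oldest first (linear history assumed).
    """
    start = history.index(base_revision) + 1
    end = history.index(revision) + 1
    return history[start:end]


# Shortened forms of the commit IDs that appear in this tutorial's job output.
history = ["dea39f8", "9022085", "2d08c13"]

print(commits_to_scan(history, "dea39f8", "9022085"))  # ['9022085']
print(commits_to_scan(history, "9022085", "2d08c13"))  # ['2d08c13']
```

If several commits are pushed at once, every commit in the range is selected, so a secret added in one commit and deleted in the next is still seen.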

When you are done with the config.yml file, commit it, push it, and go to the CircleCI dashboard to watch the pipeline launch. You may have to accept using third-party orbs in Organization settings > Security > Orb Security Settings if this is the first time you have used them.

Scan commits

Click the job to see its output, which reports Commits to scan: 1.

#!/bin/bash -eo pipefail

ggshield secret scan -v ci

CIRCLE_RANGE: dea39f827dfe23f06f4ea63d7fb16ab0c363db9d...90220851160dcf018f372536da223dc0396aa247
CIRCLE_SHA1: 90220851160dcf018f372536da223dc0396aa247
Commits to scan: 1
Scanning Commits  [####################################]  100%
secrets-engine-version: 2.71.0
No secrets have been found
commit 90220851160dcf018f372536da223dc0396aa247
Author: ***
Date: ***

CircleCI received exit code 0

To verify the shield is working as expected, commit a single change to one of the test repository’s files. For example, open the sample_secrets/bucket_s3.py file and append or remove trailing whitespace, then commit this change (make sure you are on the circleci-project-setup branch).

The pipeline will fail because ggshield scans the latest commit and detects two secrets in the file:

#!/bin/bash -eo pipefail
ggshield secret scan -v ci
CIRCLE_RANGE: 90220851160dcf018f372536da223dc0396aa247...2d08c13226628ecfb3ee9a07001c185915a84adf
CIRCLE_SHA1: 2d08c13226628ecfb3ee9a07001c185915a84adf
Commits to scan: 1
Scanning Commits  [####################################]  100%

secrets-engine-version: 2.71.0

commit 2d08c13226628ecfb3ee9a07001c185915a84adf
Author: XXXX
Date: XXXX

🛡️  ⚔️  🛡️  2 incidents have been found in file bucket_s3.py

>>> Incident 1(Secrets detection): AWS Keys (Validity: Invalid)  (Ignore with SHA: 9f2785cab705507aaea637b8b38d8e1ff9ce8a4334dda586187cbb018ed33163) (1 occurrence)
 8  8 |
 9  9 | def aws_upload(data: Dict):
10    |     database = aws_lib.connect("AKIA************WSZ5", "hjshnk5**************************89sjkja")
                                        |_____client_id____|
10    |     database = aws_lib.connect("AKIA************WSZ5", "hjshnk5**************************89sjkja")
   10 |     database = aws_lib.connect("AKIA************WSZ5", "hjshnk5**************************89sjkjb")
11 11 |     database.push(data)

>>> Incident 2(Secrets detection): AWS Keys (Validity: Invalid)  (Ignore with SHA: e8077f59453457d2b3d980be4d8655eaa901c7aa8810a6079b429477e07a57f9) (1 occurrence)
 9  9 | def aws_upload(data: Dict):
10    |     database = aws_lib.connect("AKIA************WSZ5", "hjshnk5**************************89sjkja")
   10 |     database = aws_lib.connect("AKIA************WSZ5", "hjshnk5**************************89sjkjb")
                                        |_____client_id____|
   10 |     database = aws_lib.connect("AKIA************WSZ5", "hjshnk5**************************89sjkjb")
                                                                |_____________client_secret____________|
11 11 |     database.push(data)

Exited with code exit status 1
CircleCI received exit code 1

Validity: Invalid tells you two things:

The secret could be checked (this is not always the case).
The secret isn’t valid anymore.

See the GitGuardian validity checks documentation for details.

Going further: scanning the commit history

But what if you would like to scan all past commits for secrets? The historical scan was done for you by GitGuardian when you forked the sample_secrets repository (this is the default behavior).

Go to your GitGuardian dashboard and search for the sample_secrets source on the Perimeter page. You should see that GitGuardian detected nine open secret incidents in the repository.

Scan commits

If needed, you can Scan the selected source again.

Open secrets

Click the source to display the Table of secrets. Incidents detected during a historical scan are tagged.

Table of secrets

You can scan any Git history with the command ggshield secret scan repo, but there is no dedicated orb for it.

Going further: remediation and developer workflow

If you made it this far, congratulations! You can be sure that any secret committed to this repository will break the pipeline and be reported in the dashboard, along with all past incidents. You can read more about how to use the dashboard to assign incidents, collaborate, and organize the cleanup of leaked secrets in your repositories.

Here is a recommendation we give to all GitGuardian users: prevention will always be preferable to remediation, so aim at integrating secrets detection as early as possible in the developer workflow.

To understand why, imagine for a moment that GitGuardian detects a widely used secret in the CircleCI workflow. Best practice is to revoke and rotate it immediately, as if it were compromised, even if it wasn’t. But rotating a secret is almost always a tough job. It can interrupt the work of many people and cause unexpected failures all along the CI/CD chain, or even in production.

That’s why we always advocate for integrating GitGuardian in the developer workflow with ggshield as a pre-commit, pre-push (client-side), or pre-receive (server-side) hook, making sure no secret can reach the version control system in the first place.
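As a minimal sketch of the client-side idea, a Git pre-commit hook can be any executable that exits with a non-zero code to block the commit. The Python function below is a hypothetical wrapper, not ggshield's own hook; it assumes ggshield is installed and on the PATH, and it relies on the ggshield secret scan pre-commit subcommand to scan the staged changes.

```python
import subprocess


def secrets_check(run=subprocess.run):
    """Return the exit code a pre-commit hook should use (0 lets the commit proceed)."""
    try:
        # Scans the staged changes; ggshield exits with a non-zero code when it finds a secret.
        result = run(["ggshield", "secret", "scan", "pre-commit"])
    except FileNotFoundError:
        # ggshield is missing: block the commit rather than silently skipping the scan.
        print("ggshield not found; install it before committing")
        return 1
    return result.returncode
```

An executable .git/hooks/pre-commit script could import this function and end with sys.exit(secrets_check()). In practice, ggshield can install its own hook (for example with ggshield install --mode local), so a hand-written wrapper like this is mainly useful for understanding what the hook does.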

You can also integrate GitGuardian natively into source control management platforms.

Conclusion

This tutorial demonstrated how easily secrets can be leaked. Unlike runtime vulnerabilities, leaked secrets can persist in old commits and represent a real threat. That’s why using a secrets detector in your CI workflows is a must-have for code security.

This awareness is an essential first step toward building a culture of shared responsibility between security, operations, and developers for preventing production issues, keeping pipelines running, and remediating issues as soon as possible.

Want to learn more? Visit the GitGuardian documentation and blog for best practices, cheat sheets, and much more.
