Getting started with Codex and CircleCI
Codex is OpenAI’s coding agent, powered by the GPT-5 family of models. It reads your files, proposes edits, and runs commands directly in your local environment. It ships as both a desktop app and an open source CLI, and it extends through plugins that connect it to external tools and services.
Like any AI coding tool, Codex is strongest when the code it generates gets validated automatically. A function that looks right in the editor can still break a downstream integration, fail a lint rule, or introduce a regression that only surfaces under test. Continuous integration catches those problems before they ship, making it a natural complement to any AI coding workflow.
This tutorial walks through setting up Codex, using it on a real codebase, then adding the CircleCI plugin so the same session can build and test your app, diagnose and fix bugs, and even hand off larger maintenance tasks to CircleCI’s autonomous agent Chunk. The examples use the Codex CLI, but you can use the desktop app to get the same results.
Prerequisites
For the Codex portion:
- Node.js 18 or newer
- A sign-in for Codex: any current ChatGPT plan, including Free and Go, includes Codex access with usage limits. An OpenAI API key also works for CLI auth.
For the CircleCI plugin portion (set up partway through):
- A free CircleCI account
- A GitHub account
- The CircleCI local CLI (install steps below)
- The demo repo used in this tutorial: chunk-dataforge-demo
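Before installing anything, it can help to confirm the Node.js requirement from the list above. The following is a minimal bash sketch, not part of the original tutorial; the function name node_major_at_least is made up for illustration.

```shell
# Succeed when a Node.js version string (e.g. "v20.11.1") has a major
# version greater than or equal to the given minimum.
# node_major_at_least is a hypothetical helper name.
node_major_at_least() {
  local version="${1#v}"        # strip the leading "v"
  local minimum="$2"
  local major="${version%%.*}"  # keep everything before the first dot
  [[ "$major" =~ ^[0-9]+$ ]] && (( major >= minimum ))
}

if command -v node >/dev/null 2>&1; then
  if node_major_at_least "$(node --version)" 18; then
    echo "Node.js $(node --version) is new enough"
  else
    echo "Node.js $(node --version) is older than 18; upgrade before installing Codex"
  fi
else
  echo "Node.js not found on PATH"
fi
```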
Installing Codex
Install with npm:
npm i -g @openai/codex
Then start it:
codex
Codex prompts for sign-in on first run. Either a ChatGPT account or an API key works. The CLI runs on macOS, Linux, and Windows.
Getting comfortable with Codex
The terminal interface
Running codex opens a full-screen terminal UI. Beyond chat, the interface is an interactive environment: Codex reads code, proposes changes, and runs commands subject to the active approval rules.
Sandbox and approval controls
Codex separates autonomy into two settings:
- sandbox_mode defines what Codex can do technically: read-only, workspace-write, or danger-full-access.
- approval_policy defines when Codex pauses to ask: untrusted, on-request, or never.
The default Auto preset combines sandbox_mode = "workspace-write" with approval_policy = "on-request". Codex can read files, make edits, and run commands inside the working directory without prompting, but asks before writing outside the workspace or using the network. To switch modes mid-session, run /permissions.
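The two settings can also be written down as configuration. The sketch below prints the Auto preset's values as TOML to a scratch file, so nothing in a real config is overwritten. It assumes Codex reads persistent settings from ~/.codex/config.toml; check that path against your install before copying anything over.

```shell
# Write a sample to a temporary file rather than touching a real config.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
# Values matching the Auto preset described above
sandbox_mode = "workspace-write"
approval_policy = "on-request"
EOF

# Review the sample; copy the lines into ~/.codex/config.toml (assumed
# location) only after confirming that is where your install reads from.
cat "$sample"
```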
Useful slash commands
- /plugins opens the plugin browser (covered later)
- /skills lists skills currently available in the session
- /model switches between available models
- /permissions adjusts the active sandbox and approval policy
- /status shows current model, usage limits, and active mode
- /review runs a separate Codex agent against the working tree
A first session: exploring a real codebase
With Codex installed, the next step is using it on actual code. CircleCI’s DataForge demo is a good first target: a small Node.js data-processing utility with a Jest test suite. It’ll also serve as the project for the CircleCI plugin section later, so this is a one-time clone.
git clone https://github.com/CIRCLECI-GWP/chunk-dataforge-demo
cd chunk-dataforge-demo
codex
A few prompts that show how Codex responds to different kinds of work:
Read-only exploration:
Explain this codebase. What are the main modules and how do they relate?
Codex reads the project structure, opens key files, and returns a summary. No file changes happen here, so the sandbox stays out of the way.
Targeted reading:
What does processBatch do? Walk through it line by line.
Codex finds the function in src/processors/batch.js and explains it. This is the workflow for getting up to speed on unfamiliar code without an hour of manual reading.
Proposing edits:
Add a JSDoc comment block to processBatch describing its arguments and return value.
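This time Codex proposes a file change and shows the diff. Whether it pauses first depends on the active approval settings: under the default Auto preset, edits inside the workspace can be written without a prompt, while stricter settings ask for approval, where accepting writes the change and rejecting leaves the file untouched. Use /permissions to tighten the policy if nothing should reach disk without an explicit yes.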
Running tests:
Run the test suite and tell me which tests pass.
Codex runs npm test inside the workspace, captures the output, and summarizes. Test runners, linters, and build commands all work the same way.
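To see roughly what that summary step involves, the same thing can be done by hand. This is an illustrative bash sketch, not how Codex works internally: jest_summary is a made-up helper that pulls the "Tests:" line out of captured Jest output.

```shell
# Extract the final "Tests:" summary line from captured Jest output.
# jest_summary is a hypothetical helper, not part of Jest or Codex.
jest_summary() {
  grep -E '^Tests:' | tail -n 1 | sed -E 's/^Tests:[[:space:]]+//'
}

# Demonstration on a captured sample of Jest-style output.
sample_output='PASS tests/batch.test.js
Tests:       1 failed, 5 passed, 6 total
Time:        1.2 s'
printf '%s\n' "$sample_output" | jest_summary
# prints: 1 failed, 5 passed, 6 total
```

Inside the repo, the equivalent would be npm test 2>&1 | jest_summary, since Jest writes its summary to stderr.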
That’s the core loop: read, propose, run, approve. Everything else Codex does is a variation on it.
Extending Codex with plugins
Codex’s built-in capabilities cover general coding work well. For domain-specific tasks like interacting with Figma designs, GitHub issues, Linear tickets, or CI/CD systems, plugins extend the agent with skills and tools tailored to that domain.
A plugin can bundle:
- Skills: instruction files (SKILL.md) that teach Codex how to handle specific tasks. Codex loads them lazily based on the prompt, so a plugin with ten skills only consumes context for the one being used.
- MCP servers: external services that show up as tools alongside Codex’s built-ins.
- App integrations: connectors to third-party apps.
The rest of this tutorial focuses on one plugin in depth: the CircleCI plugin, which turns Codex into a CI/CD copilot.
Setting up the CircleCI plugin
Connect the demo repo to CircleCI
First, push the cloned demo to a fork so CircleCI has something to run pipelines against:
- Fork github.com/CIRCLECI-GWP/chunk-dataforge-demo on GitHub
- Update the local clone’s remote to point at the fork: git remote set-url origin <fork-url>
- In CircleCI, head to Projects and find the fork
- Click Set Up Project and let the initial pipeline run
Whether that first build passes or fails doesn’t matter. The point is generating some pipeline data the plugin can query.
Install and authenticate the CircleCI CLI
The plugin drives many operations through the local CircleCI CLI.
macOS:
brew install circleci
Linux:
curl -fLSs https://raw.githubusercontent.com/CircleCI-Public/circleci-cli/main/install.sh | bash
Then authenticate with a Personal API Token:
- In CircleCI, click the avatar in the top right corner
- Open User Settings → Personal API Tokens
- Click Create New Token, name it something like codex-cli-key, and copy the value
- Run circleci setup and paste the token when prompted
Verify:
circleci --version
Install the plugin in Codex
Back in Codex, open the plugin browser:
/plugins
Find CircleCI in the directory, select Install, then restart Codex so the install takes effect. After restart, /skills should show four CircleCI skills:
- circleci-builds for pipeline status, failed-build diagnosis, and flaky-test investigation
- circleci-config for reviewing and improving .circleci/config.yml
- circleci-cli for local validation, pipeline triggers, and reruns
- chunk for handing off autonomous maintenance tasks to Chunk
Many of these capabilities are also available through the CircleCI MCP server, which works with other agentic tools like Cursor, Claude Code, and Windsurf. The plugin is the easiest path inside Codex. The MCP server is the option for teams standardizing across multiple AI tools.
Using the CircleCI plugin
The plugin is invoked with @circleci followed by a natural-language request:
@circleci check the latest pipeline run
Codex routes the prompt through the relevant skill. Pipeline status questions route to circleci-builds. CLI operations route to circleci-cli. Implicit routing also works: a prompt like “what failed in my last build?” reaches the right skill without an explicit @circleci mention, since each skill’s description matches the intent.
Checking the CircleCI web interface in a browser confirms that the pipeline is green and that the information Codex reports is correct.
Validating config before pushing
@circleci review the CircleCI config for this repo
The circleci-config skill runs circleci config validate and surfaces syntax errors, semantic issues, and recommendations with line numbers. That beats discovering a YAML typo after a push triggers a failed build.
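The same validation can run without Codex in the loop, for example from a Git pre-push hook. The following is a minimal sketch under that assumption; validate_ci_config is a hypothetical helper, and it deliberately skips rather than blocks when the config file or the CLI is missing.

```shell
# Validate the CircleCI config if both the file and the CLI are present.
# validate_ci_config is a hypothetical helper name.
validate_ci_config() {
  local config="${1:-.circleci/config.yml}"
  if [ ! -f "$config" ]; then
    echo "no $config found; skipping validation"
    return 0
  fi
  if ! command -v circleci >/dev/null 2>&1; then
    echo "circleci CLI not installed; skipping validation"
    return 0
  fi
  # A non-zero exit here aborts the push when used as a pre-push hook.
  circleci config validate "$config"
}

validate_ci_config
```

Saved as .git/hooks/pre-push and made executable, this would stop a push when validation fails.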
The same skill handles optimization questions:
@circleci suggest improvements to my CircleCI config for faster builds
Typical suggestions cover caching, parallelism, test splitting, and image choices.
Triggering pipelines from the terminal
@circleci run a pipeline on my current branch
The circleci-cli skill reads the Git remote and branch, triggers the pipeline through the CLI, and returns a monitoring link. More targeted prompts work when you want a specific branch or workflow:
@circleci trigger a pipeline on the staging branch
@circleci rerun the last failed workflow
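For comparison, a pipeline can also be triggered outside Codex through CircleCI's v2 REST API. This is a different route from the one the skill takes (the skill goes through the CLI), and the sketch below rests on several assumptions: the project is addressable by a slug such as gh/my-org/my-repo, a personal API token is exported as CIRCLE_TOKEN, and the branch name contains no characters that need JSON escaping. Projects connected through the GitHub App may need a different endpoint.

```shell
# Build the JSON body for a branch-based trigger.
# pipeline_payload and trigger_pipeline are hypothetical helper names.
pipeline_payload() {
  printf '{"branch":"%s"}' "$1"
}

# POST to the v2 "trigger a new pipeline" endpoint.
# Requires CIRCLE_TOKEN to be set; the slug and branch are placeholders.
trigger_pipeline() {
  local slug="$1" branch="$2"
  curl --fail --silent --show-error \
    --request POST \
    --header "Circle-Token: ${CIRCLE_TOKEN:?set CIRCLE_TOKEN first}" \
    --header "Content-Type: application/json" \
    --data "$(pipeline_payload "$branch")" \
    "https://circleci.com/api/v2/project/${slug}/pipeline"
}

# Example (not executed here):
#   trigger_pipeline gh/my-org/my-repo staging
```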
Investigating build failures
@circleci diagnose the latest failing build
The circleci-builds skill pulls logs from the failed job, classifies the failure as transient or deterministic, identifies the step that broke, and recommends targeted fixes or reruns. With access to both the failure and the code in the same session, Codex can apply the fix directly.
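The transient-versus-deterministic distinction can be illustrated with a simple rerun heuristic. This is not how the skill classifies failures (the tutorial does not say); it is a sketch of the underlying idea, with classify_failure as a made-up helper: a command that fails every time is deterministic, and one that fails only sometimes is transient.

```shell
# Rerun a command several times and label the outcome.
# Usage: classify_failure <attempts> <command...>
# classify_failure is a hypothetical helper name.
classify_failure() {
  local attempts="$1"; shift
  local passes=0 failures=0 i
  for (( i = 0; i < attempts; i++ )); do
    if "$@" >/dev/null 2>&1; then
      passes=$(( passes + 1 ))
    else
      failures=$(( failures + 1 ))
    fi
  done
  if (( failures == 0 )); then
    echo "passing"
  elif (( passes == 0 )); then
    echo "deterministic"
  else
    echo "transient"
  fi
}

# Example (not executed here):
#   classify_failure 3 npm test
```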
A natural follow-up extends the workflow into a closed loop:
@circleci fix that issue and run the pipeline again. Let me know when it passes.
Codex edits the code, commits, triggers a new build, and reports back. If there is another failure, Codex will continue iterating through different changes until it finds a fix and the pipeline comes back green.
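The "let me know when it passes" part of that loop amounts to polling until the pipeline reaches a terminal state. A rough bash sketch of that idea follows. The status command is supplied by the caller and is entirely hypothetical, and the status words (running, success, failed) are assumptions modeled on CircleCI workflow statuses.

```shell
# Poll a status command until it prints a terminal state.
# Usage: wait_for_pipeline <max_checks> <sleep_seconds> <status_command...>
# wait_for_pipeline is a hypothetical helper name.
wait_for_pipeline() {
  local max_checks="$1" delay="$2"; shift 2
  local status="" i
  for (( i = 1; i <= max_checks; i++ )); do
    status="$("$@")"
    case "$status" in
      success) echo "pipeline passed after $i check(s)"; return 0 ;;
      failed)  echo "pipeline failed after $i check(s)"; return 1 ;;
    esac
    sleep "$delay"
  done
  echo "gave up after $max_checks checks (last status: $status)"
  return 2
}

# Example with a placeholder status command (not executed here):
#   wait_for_pipeline 30 20 my_status_command
```

A failed result is where the next fix attempt would start; the cap on checks keeps the loop from running forever.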
For a specific pipeline that needs a closer look, paste the URL:
@circleci look at this pipeline: https://app.circleci.com/pipelines/github/my-org/my-repo/789
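A URL of that shape carries both the project and the pipeline number. The sketch below shows one way to pull them apart in bash. parse_pipeline_url is a hypothetical helper; it handles only the /pipelines/github/... and /pipelines/bitbucket/... forms and rejects anything else.

```shell
# Split a CircleCI pipeline URL into "<project-slug> <pipeline-number>".
# parse_pipeline_url is a hypothetical helper name.
parse_pipeline_url() {
  local url="$1"
  local pattern='^https://app\.circleci\.com/pipelines/(github|bitbucket)/([^/]+)/([^/]+)/([0-9]+)'
  if [[ "$url" =~ $pattern ]]; then
    local vcs="gh"
    [[ "${BASH_REMATCH[1]}" == "bitbucket" ]] && vcs="bb"
    echo "${vcs}/${BASH_REMATCH[2]}/${BASH_REMATCH[3]} ${BASH_REMATCH[4]}"
  else
    echo "unrecognized pipeline URL: $url" >&2
    return 1
  fi
}

parse_pipeline_url "https://app.circleci.com/pipelines/github/my-org/my-repo/789"
# prints: gh/my-org/my-repo 789
```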
Handing harder work to Chunk
Some maintenance jobs are too big for a single Codex turn: stabilizing every flaky test in a suite, generating missing test coverage across a service, optimizing a config file end-to-end. The chunk skill in the plugin hands those off to Chunk, CircleCI’s autonomous CI agent. Chunk runs inside CircleCI’s infrastructure, reads the repo, makes changes, validates them against the test suite, and opens a PR when the pipeline passes. No local environment is required.
One-time Chunk setup
Chunk is enabled per organization in the CircleCI web app, not in Codex:
- Open the CircleCI web app at https://app.circleci.com, select Chunk from the sidebar, and click Get Started
- Install the CircleCI GitHub App if the org isn’t already on it
- Provide an OpenAI or Anthropic API key (Chunk uses bring-your-own-key)
The DataForge demo includes a .circleci/cci-agent-setup.yml describing how to install dependencies, which Chunk uses to prepare a clean environment before each task. Existing projects can add one manually or let Chunk generate it.
Asking Chunk from Codex
Once Chunk is set up, the plugin can dispatch tasks through plain prompts. Take flaky tests as an example:
@circleci ask Chunk to fix any flaky tests in this project
That single prompt starts a Chunk task on CircleCI. Chunk reads the test history collected from past pipeline runs, identifies tests that have failed inconsistently, traces them to their source files, generates fixes, validates the changes by running the actual pipeline, and opens a PR if the build passes. Progress streams back into Codex while the work runs. The session can end without interrupting the task. CircleCI completes it in the background.
Flaky test repair is one example. Chunk can be prompted to do most CI/CD maintenance work the same way, including generating missing tests, refactoring across modules, optimizing pipeline configuration, and fixing bugs surfaced by build failures.
Resuming sessions
If you close the Codex terminal but need to resume your work later, you can recall the entire session with this command:
codex resume --last
This resumes the most recent session in the current working directory with the full transcript and approvals intact. Adding --all includes sessions from other directories.
Troubleshooting the Codex CLI and CircleCI skill
Here are a few common issues that come up when setting up the Codex CLI and getting the CircleCI skill running.
Plugin doesn’t show in /skills
Restart Codex. The plugin install requires a fresh session for skill metadata to load.
@circleci returns CLI errors
The plugin routes most operations through the local circleci binary. Two checks:
- Confirm the CLI is on PATH: circleci --version should print a version.
- Confirm the CLI is authenticated: circleci diagnostic should report a valid token. If not, run circleci setup again.
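Both checks can be combined into one small script. This is a convenience sketch rather than anything the plugin ships; check_circleci_cli is a made-up name, and the only CLI commands it relies on are the ones already mentioned in this section.

```shell
# Run the PATH check first, then the authentication check.
# check_circleci_cli is a hypothetical helper name.
check_circleci_cli() {
  if ! command -v circleci >/dev/null 2>&1; then
    echo "circleci not found on PATH"
    return 1
  fi
  echo "found $(command -v circleci)"
  # Reports on the stored token and host; exits non-zero on problems.
  circleci diagnostic
}

check_circleci_cli || echo "fix the item above, then run circleci setup if the token is the problem"
```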
Plugin can’t find the project
If a query like “what’s the status of my latest pipeline” returns nothing:
- Confirm the project is followed in the CircleCI dashboard
- Check the Git remote: git remote -v
- Make sure the API token used for circleci setup has access to that project
- Specify the project explicitly:
@circleci get build status for gh/my-org/my-repo
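If the right slug is unclear, it can be derived from the Git remote. The sketch below assumes a GitHub remote in either SSH or HTTPS form and the gh/<org>/<repo> slug shape used in the prompt above. remote_to_slug is a hypothetical helper.

```shell
# Convert a GitHub remote URL into a gh/<org>/<repo> project slug.
# remote_to_slug is a hypothetical helper name.
remote_to_slug() {
  local url="${1%.git}" path   # drop a trailing .git if present
  case "$url" in
    git@github.com:*)     path="${url#git@github.com:}" ;;
    https://github.com/*) path="${url#https://github.com/}" ;;
    *) echo "not a GitHub remote: $1" >&2; return 1 ;;
  esac
  echo "gh/${path%/}"
}

remote_to_slug "git@github.com:my-org/my-repo.git"
# prints: gh/my-org/my-repo
```

Inside a clone, remote_to_slug "$(git remote get-url origin)" gives the slug for the current project.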
Chunk task fails to start
Chunk needs the CircleCI GitHub App (not just OAuth) installed on the org, plus a valid AI provider API key stored in the circleci-agents context. The Chunk setup screen in the CircleCI web app surfaces missing pieces.
Authentication errors from the CLI
- Confirm the token under User Settings → Personal API Tokens is still valid
- Regenerate and rerun circleci setup if needed
- Strip any trailing whitespace or newlines from the token value
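Stray whitespace is easy to pick up when copying a token from a browser. A small sketch for cleaning a pasted value follows. clean_token is a hypothetical helper, and it removes all whitespace on the assumption that a valid token contains none.

```shell
# Remove every whitespace character (spaces, tabs, newlines) from a value.
# clean_token is a hypothetical helper name.
clean_token() {
  printf '%s' "$1" | tr -d '[:space:]'
}

# Demonstration with a placeholder value, not a real token.
clean_token "  example-token-value
"
# prints: example-token-value
```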
Bringing it together
Codex covers what happens at the keyboard. CircleCI covers what happens after the push: running test suites, validating builds, shipping changes. The plugin keeps both sides accessible from one terminal session, and Chunk handles the maintenance work that doesn’t belong in a foreground prompt.
The fastest way to put any of this into practice is on a real project. Sign up for a free CircleCI account and connect a repo to start running pipelines today.