Using Workspaces to Share Data between Jobs
Each workflow has an associated workspace. Workspaces are used to transfer data to downstream jobs as the workflow progresses.
Use workspaces to pass along data that is unique to a workflow and is needed for downstream jobs. Workflows that include jobs running on multiple branches may require data to be shared using workspaces. Workspaces are also useful for projects in which compiled data are used by test containers.
For example, consider a project with a build job that builds a .jar file and saves it to a workspace. The build job fans out into concurrently running test jobs, such as code-coverage, each of which can access the jar by attaching the workspace.
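The fan-out described above might be configured roughly as follows. This is a minimal sketch, not a definitive setup: the cimg/openjdk:17.0 image, the unit-test job name, the ./build.sh, ./run-unit-tests.sh, and ./run-coverage.sh scripts, and the target/app.jar path are all assumptions made for illustration.

```yaml
version: 2.1

executors:
  jvm:
    docker:
      - image: cimg/openjdk:17.0  # assumed image; use whatever your project builds with

jobs:
  build:
    executor: jvm
    steps:
      - checkout
      # Hypothetical build script, assumed to write target/app.jar
      - run: ./build.sh
      # Save the jar so that downstream jobs can attach it
      - persist_to_workspace:
          root: target
          paths:
            - app.jar

  unit-test:  # hypothetical job name
    executor: jvm
    steps:
      - checkout
      # Unpack the workspace so the jar appears at target/app.jar
      - attach_workspace:
          at: target
      - run: ./run-unit-tests.sh target/app.jar  # hypothetical script

  code-coverage:
    executor: jvm
    steps:
      - checkout
      - attach_workspace:
          at: target
      - run: ./run-coverage.sh target/app.jar  # hypothetical script

workflows:
  build-and-test:
    jobs:
      - build
      # Both test jobs require build, so they start concurrently once it finishes
      - unit-test:
          requires:
            - build
      - code-coverage:
          requires:
            - build
```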
Workspaces are additive-only data storage. Jobs can persist data to the workspace. When a workspace is used, data is archived and stored in an off-container store. With each addition to the workspace a new layer is created in the store. Downstream jobs can then attach the workspace to their container filesystem. Attaching the workspace downloads and unpacks each layer based on the ordering of the upstream jobs in the workflow.
Some notes about workspaces:
- Each workflow has a temporary workspace associated with it. The workspace can be used to pass along unique data built during a job to other jobs in the same workflow.
- Jobs can add files into the workspace using the persist_to_workspace step and download the workspace content into their file system using the attach_workspace step.
- The workspace is additive only: jobs may add files to the workspace but cannot delete files from it.
- Each job can only see content added to the workspace by the jobs that are upstream of it.
- When attaching a workspace, the “layer” from each upstream job is applied in the order the upstream jobs appear in the workflow graph. When two jobs run concurrently, the order in which their layers are applied is undefined.
- If multiple concurrent jobs persist the same filename, attaching the workspace will fail with an error.
- If a workflow is re-run it inherits the same workspace as the original workflow. When re-running failed jobs, only the re-run jobs will see the same workspace content as the jobs in the original workflow.
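One way to keep concurrent jobs from persisting the same filename is to give each job its own subdirectory under the workspace root. The sketch below illustrates this under stated assumptions: the cimg/base:stable image, the lint, unit, and collect job names, and the reports directory layout are hypothetical.

```yaml
version: 2.1

executors:
  base:
    docker:
      - image: cimg/base:stable  # assumed image

jobs:
  lint:  # hypothetical job
    executor: base
    steps:
      - run: mkdir -p reports/lint && echo "lint ok" > reports/lint/result.txt
      - persist_to_workspace:
          root: reports
          paths:
            - lint  # stored in the workspace as lint/result.txt

  unit:  # hypothetical job
    executor: base
    steps:
      - run: mkdir -p reports/unit && echo "unit ok" > reports/unit/result.txt
      - persist_to_workspace:
          root: reports
          paths:
            - unit  # stored as unit/result.txt, so it cannot clash with lint/

  collect:  # hypothetical job
    executor: base
    steps:
      # Sees the layers from both upstream jobs
      - attach_workspace:
          at: /tmp/reports
      - run: cat /tmp/reports/lint/result.txt /tmp/reports/unit/result.txt

workflows:
  reports:
    jobs:
      - lint
      - unit
      - collect:
          requires:
            - lint
            - unit
```

Because the two result.txt files live under different directories, the order in which the concurrent layers are applied does not matter.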
By default, workspace storage duration is set to 15 days. This can be customized on the CircleCI web app by navigating to Plan > Usage Controls. Currently, 15 days is also the maximum storage duration you can set.
To persist data from a job and make it available to other jobs, configure the job to use the persist_to_workspace key. Files and directories named in the paths: property of persist_to_workspace are uploaded to the workflow’s temporary workspace, relative to the directory specified with the root key. They are then available for subsequent jobs (and re-runs of the workflow) to use.
If you have custom storage settings, persist_to_workspace uses the retention period you have set for your workspaces. If none is set, the default of 15 days applies.
To get saved data in a job, configure the attach_workspace key. Consider a config.yml file that defines two jobs, where the downstream job uses the artifact of the flow job. The workflow configuration is sequential, so that downstream requires flow to finish before it can start.
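A config.yml along these lines could look like the sketch below. The flow and downstream job names come from the description above; the cimg/base:stable image, the flow-then-downstream workflow name, the /tmp/workspace directory, and the echo-output file are assumptions chosen for illustration.

```yaml
version: 2.1

executors:
  my-executor:
    docker:
      - image: cimg/base:stable  # assumed image

jobs:
  flow:
    executor: my-executor
    steps:
      - run: mkdir -p /tmp/workspace
      - run: echo "Hello, world!" > /tmp/workspace/echo-output
      # Persist /tmp/workspace/echo-output for use in the downstream job
      - persist_to_workspace:
          # Absolute path, or a path relative to working_directory
          root: /tmp/workspace
          # Paths are relative to root
          paths:
            - echo-output

  downstream:
    executor: my-executor
    steps:
      # Download and unpack the workspace into /tmp/workspace
      - attach_workspace:
          at: /tmp/workspace
      - run: |
          if [[ "$(cat /tmp/workspace/echo-output)" == "Hello, world!" ]]; then
            echo "It worked!"
          else
            echo "Nope!"; exit 1
          fi

workflows:
  flow-then-downstream:  # hypothetical workflow name
    jobs:
      - flow
      - downstream:
          requires:
            - flow
```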
For a live example of using workspaces to pass data between build and deploy jobs, see the config.yml that is configured to build the CircleCI documentation.
For additional conceptual information on using workspaces, caching, and artifacts, refer to the Persisting Data in Workflows: When to Use Caching, Artifacts, and Workspaces blog post.
Workspace storage customization
When using self-hosted runners, there is a network and storage usage limit included in your plan. There are certain actions related to workspaces that will accrue network and storage usage. Once your usage exceeds your limit, charges will apply.
Retaining a workspace for a long period of time has storage cost implications, so it is best to determine why you are retaining workspaces. In most projects, the benefit of retaining a workspace is that you can re-run a workflow from its failed jobs. Once the build passes, the workspace is likely not needed. Setting a low storage retention for workspaces is recommended if this suits your needs.
You can customize storage usage retention periods for workspaces on the CircleCI web app by navigating to Plan > Usage Controls. For information on managing network and storage usage, see the Persisting Data page.
It is important to define paths and files when using persist_to_workspace. Not doing so can cause a significant increase in storage. Specify paths and files using the following syntax:

- persist_to_workspace:
    root: /tmp/dir
    paths:
      - foo/bar
      - baz
CircleCI Documentation by CircleCI is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.