
Deploy and manage AI workloads on Scaleway infrastructure with CircleCI

Zan Markan

Developer Advocate

Jacob Rahme

Staff Software Engineer

With automation and CI/CD practices, the entire AI workflow can be run and monitored efficiently, often by just one person. Still, running AI/ML on GPU instances has its challenges. This tutorial shows you how to meet those challenges using the control and flexibility of CircleCI runners combined with Scaleway, a powerful cloud ecosystem for building, training, and deploying applications at scale. We will also demonstrate the cost-effectiveness of ephemeral runners that consume only the resources required for AI/ML work.

Prerequisites

This tutorial builds on an MLOps pipeline example first introduced in our blog series on CI and CD for machine learning. We recommend you read through these first to gain a better understanding of the project you will be working with.

The repository is slightly modified from the previous examples to work with the Scaleway cloud provider and Pulumi.

If you don’t have them yet, you will need to create CircleCI, Scaleway, and Pulumi accounts.

Scaleway offers 100 EUR of credit for newly created accounts, which will come in handy when trying this out. The credit is also valid for certain GPU instances, which the demo project can make use of. You will need to verify your identity and provide a payment method, following their instructions.

For both CircleCI and Pulumi, you will be using the free tier.

For CircleCI, make sure you are an admin in the organization you are using for the project. Configuring new runner namespaces will require admin access.

Note: This is an advanced tutorial and not aimed at beginners, and assumes a level of familiarity with CircleCI, CI/CD concepts, and Infrastructure as Code.

High-level project flow

As the pipeline starts, first provision your environment and create a new runner resource class in CircleCI. This is how CircleCI will communicate with your Scaleway infrastructure.

Then use Pulumi to provision two instances on Scaleway:

    1. One hosts your CircleCI runner and acts as your CI/CD agent for training and deploying the model.
    2. The other acts as a model server, to which you will deploy your trained models.

Note: In real-world production scenarios, you would likely not provision the model serving instance from the same pipeline, as you would need it to be permanent rather than ephemeral.

After everything is provisioned, the pipeline installs the required dependencies, then trains, tests, and deploys your model, all executed on your newly provisioned CircleCI runner. These jobs are thoroughly covered in the CI/CD for ML blog post series, so we won’t go into detail here.

Finally, the resources are cleaned up, and the newly created CircleCI runner is removed so the pipeline can run again.

Walkthrough and project setup

We recommend that you fork the sample repository and continue from there. You can also clone it directly using this command:

git clone https://github.com/CIRCLECI-GWP/circleci-deploy-ml-scaleway.git
cd circleci-deploy-ml-scaleway

If you have cloned the repository, make sure to save and push to GitHub as you will need to connect it to CircleCI. This guide will show you how to get started, and then walk you through the files that comprise the pipeline.

Preparing environment variables

You will need a number of secrets and environment variables set up before the pipeline can be run. The secrets are split logically into four contexts:

    1. Create a new CircleCI API key and store it in a CircleCI context named circleci-api as CIRCLECI_CLI_TOKEN. This token is used to provision new runners from within the pipeline.
    2. Create a new Pulumi access token and store it in a context named pulumi as PULUMI_ACCESS_TOKEN.
    3. Create a Scaleway access key. You will need to generate two values: an access key ID and a secret key. Create a scaleway context and store them in SCW_ACCESS_KEY and SCW_SECRET_KEY, respectively.

Finally, create a context ml-scaleway-demo and populate it with the following environment variables:

  1. DEPLOY_SERVER_USERNAME (this tutorial uses demo)
  2. DEPLOY_SERVER_PASSWORD (this tutorial uses demodemo)
  3. DEPLOY_SERVER_PATH as /var/models
  4. MODEL_SERVER_PUBLIC_KEY (your public SSH key, which you will use to access the model server instance)
  5. MODEL_SERVER_SSH_KEY (your private SSH key, which you will use to access the model server instance)

Note: You can generate a new SSH key (using the ssh-keygen command) or use an existing one. You will need to copy the public key and paste it into the Scaleway console later on.
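For example, you can create a dedicated key pair for this tutorial (one way to do it; the file name mirrors the path the pipeline later writes the public key to):

# Generate a dedicated, passphrase-less key pair for the model server
ssh-keygen -t rsa -b 4096 -f ~/.ssh/id_rsa_modelserver -N ""

Store the contents of ~/.ssh/id_rsa_modelserver.pub in MODEL_SERVER_PUBLIC_KEY and the private key in MODEL_SERVER_SSH_KEY.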

Pulumi project and stack setup

Pulumi is the tool that helps you provision infrastructure. It offers an SDK-based approach to infrastructure provisioning for many different programming languages. This allows you to use Python, like in the rest of the AI/ML scripts.

In Pulumi you will need to create a new project and stack. A stack is an independently configurable instance of a Pulumi project. The project used in this tutorial is located in the org yemiwebby-org and is named cci-ml-runner. It contains one stack, cci-runner-linux.

The Pulumi files are located in the pulumi directory. You may want to modify them with your preferred project and stack names, as well as your own Scaleway configuration.
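If you prefer your own names, a minimal sketch of creating the stack from the CLI (run inside the pulumi directory, with your own org substituted):

cd pulumi
pulumi login
# The project name is read from Pulumi.yaml; only the org and stack name change
pulumi stack init <your-org>/cci-runner-linux

Remember to update the stack references in .circleci/config.yml to match.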

Scaleway project setup

In Scaleway, create a new project named cci-ml-runner. Go to Project Settings, copy the Project ID (it should be in UUID format), and paste it into the file pulumi/Pulumi.cci-runner-linux.yaml where you see scaleway:project_id. Leave the rest of the file unchanged; you’ll need this specific region and zone combination to use the GPU resources.

config:
  scaleway:project_id: YOUR_PROJECT_ID_UUID
  scaleway:region: fr-par
  scaleway:zone: fr-par-1

Make sure to pass in your SSH key to Scaleway using the SSH Keys section of the Scaleway console. This will allow you to SSH into the instances created by Pulumi later on.

Scaleway SSH keys

Setting up the CircleCI pipeline

From the CircleCI dashboard, search for and select your project. Click Set Up.

Set up project

You will be prompted to select a branch to run the pipeline from. Select the main branch, then click Set Up Project to trigger the pipeline using the .circleci/config.yml file in the repository. If all the prerequisites are met and the environment variables are set up correctly, the pipeline should start running.

CircleCI project pipeline

Your pipeline has one workflow of interest: build-deploy. It runs nine jobs, from provisioning the runner and infrastructure through to teardown.

To understand how the pipeline works, review the .circleci/config.yml file in detail. This file defines the jobs, workflows, and executors used in the pipeline.

The first job is provision_runner.

Provision runner

This job does most of the heavy lifting for provisioning cloud infrastructure and configuring the runner. It does it all in an automated way so that instances are truly ephemeral and consume only the resources required for the rest of the AI/ML work.

The job runs in a standard CircleCI Docker executor:

jobs:
  provision_runner:
    docker:
      - image: cimg/python:3.11

First, check out the code from the repository and write the model server public key to the ~/.ssh/id_rsa_modelserver.pub file. You will need this for SSH access later on in the tutorial.

- run:
    name: Write model server public key
    command: |
      mkdir -p ~/.ssh
      echo "$MODEL_SERVER_PUBLIC_KEY" > ~/.ssh/id_rsa_modelserver.pub

The CircleCI CLI will help you create new runner resource classes. It authenticates using the CIRCLECI_CLI_TOKEN environment variable created earlier.

Install the CLI, which is used to interact with resource classes and tokens for self-hosted runners.

- run:
    name: Install CircleCI CLI
    command: |
      # Make CircleCI CLI available at /usr/local/bin/circleci
      curl -fLSs https://raw.githubusercontent.com/CircleCI-Public/circleci-cli/main/install.sh | sudo bash

You will also install Pulumi and the Scaleway Pulumi provider using pip. This enables the provisioning logic later in the pipeline.

- run:
    name: Install Pulumi & Scaleway SDK
    command: |
      python3 -m pip install pulumi pulumiverse_scaleway

Log into Pulumi using the CircleCI Pulumi orb:

- pulumi/login

Next, provision a new runner resource class and prepare the cloud-init file required for Pulumi to configure the virtual machine:

- run:
    name: Provision new runner and prepare cloud-init file
    command: |
      RESOURCE_CLASS="tutorial-gwp/scaleway-linux-${CIRCLE_WORKFLOW_ID}"

The RESOURCE_CLASS variable is defined using the CIRCLE_WORKFLOW_ID to ensure uniqueness across parallel or repeated workflows.

if circleci runner resource-class list tutorial-gwp --token "$CIRCLECI_CLI_TOKEN" | awk '{print $1}' | grep -Fxq "${RESOURCE_CLASS}"; then
  echo "Resource class '${RESOURCE_CLASS}' already exists. Skipping creation."
else
  echo "Creating resource class '${RESOURCE_CLASS}'..."
  circleci runner resource-class create "${RESOURCE_CLASS}" \
    "Autoprovisioned Linux runner on Scaleway"
fi

This checks whether the resource class already exists. If not, it creates one with a descriptive label.

runner_token_response=$(circleci runner token create "${RESOURCE_CLASS}" "${RESOURCE_CLASS##*/}" --token "$CIRCLECI_CLI_TOKEN")
runner_token=$(echo "$runner_token_response" | grep "token:" | awk '{print $2}')

Generate a new token for the runner resource class and extract the actual token value using awk.

cd pulumi
python3.11 -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install pulumi pulumiverse_scaleway

Inside the pulumi folder, create and activate a Python 3.11 virtual environment, then install all dependencies. This ensures Pulumi operations remain isolated and reproducible.

pulumi stack select yemiwebby-org/cci-ml-runner/cci-runner-linux
pulumi config set cci-ml-runner:circleciRunnerToken "$runner_token" --plaintext

Select the correct Pulumi stack and inject the runner token as a config variable.

sed "s/RUNNER_TOKEN/${runner_token}/g" runner_cloud_init_base.yml > runner_cloud_init.yml

Using sed, replace the placeholder RUNNER_TOKEN in the base cloud-init YAML file and write the output to a new file that Pulumi will use for provisioning.

Update the stack and apply all infrastructure changes:

- pulumi/update:
    stack: yemiwebby-org/cci-ml-runner/cci-runner-linux
    working_directory: pulumi

After provisioning, extract the model server’s IP and some environment variables into a .env file. Share them via CircleCI’s workspace system:

- run:
    name: Store model server IP to workspace
    command: |
      mkdir -p workspace
      echo "export DEPLOY_SERVER_HOSTNAME=$(pulumi stack output modelserver_ip  --cwd pulumi --stack  yemiwebby-org/cci-ml-runner/cci-runner-linux)" > workspace/.env
      echo "export DEPLOY_SERVER_USERNAME=root" >> workspace/.env
      echo "export DEPLOY_SERVER_PASSWORD=password" >> workspace/.env
      echo "export DEPLOY_SERVER_PATH=/var/models" >> workspace/.env
- persist_to_workspace:
    root: workspace
    paths:
      - .env

This ensures that any downstream jobs can load and use the IP address and login credentials generated during provisioning.

Provision runner

That’s it for the provision_runner job. The whole job looks like this:

jobs:
  provision_runner:
    docker:
      - image: cimg/python:3.11.9

    steps:
      - checkout
      - run:
          name: Write model server public key
          command: |
            mkdir -p ~/.ssh
            echo "$MODEL_SERVER_PUBLIC_KEY" > ~/.ssh/id_rsa_modelserver.pub

      - run:
          name: Install CircleCI CLI
          command: |
            # Make CircleCI CLI available at /usr/local/bin/circleci
            curl -fLSs https://raw.githubusercontent.com/CircleCI-Public/circleci-cli/main/install.sh | sudo bash

      - run:
          name: Install Pulumi & Scaleway SDK
          command: |
            python3 -m pip install pulumi pulumiverse_scaleway

      - pulumi/login

      - run:
          name: Provision new runner and prepare cloud-init file
          command: |
            RESOURCE_CLASS="tutorial-gwp/scaleway-linux-${CIRCLE_WORKFLOW_ID}"

            echo "Checking for existing resource class: ${RESOURCE_CLASS}"

            if [ -z "$RESOURCE_CLASS" ]; then
              echo " RESOURCE_CLASS is empty. Exiting."
              exit 1
            fi

            if circleci runner resource-class list tutorial-gwp --token "$CIRCLECI_CLI_TOKEN" | awk '{print $1}' | grep -Fxq "${RESOURCE_CLASS}"; then
              echo " Resource class '${RESOURCE_CLASS}' already exists. Skipping creation."
            else
              echo "Creating resource class '${RESOURCE_CLASS}'..."
              circleci runner resource-class create "${RESOURCE_CLASS}" \
                "Autoprovisioned Linux runner on Scaleway"
            fi

            echo "Generating new runner token..."
            runner_token_response=$(circleci runner token create "${RESOURCE_CLASS}" "${RESOURCE_CLASS##*/}" --token "$CIRCLECI_CLI_TOKEN")
            runner_token=$(echo "$runner_token_response" | grep "token:" | awk '{print $2}')

            if [ -z "$runner_token" ]; then
              echo "Failed to extract runner token. Exiting."
              exit 1
            fi

            echo "Runner token created: ${#runner_token} characters long"

            echo "Moving into Pulumi folder..."
            cd pulumi

            echo "Creating venv with Python 3.11 explicitly..."
            python3.11 -m venv venv
            source venv/bin/activate

            echo "Installing Pulumi SDK and dependencies...."
            pip install --upgrade pip
            pip install pulumi pulumiverse_scaleway

            echo "Selecting Pulumi stack..."
            pulumi stack select yemiwebby-org/cci-ml-runner/cci-runner-linux

            echo "Setting Pulumi config..."
            pulumi config set cci-ml-runner:circleciRunnerToken "$runner_token" --plaintext

            echo "Preparing cloud-init file..."
            sed "s/RUNNER_TOKEN/${runner_token}/g" runner_cloud_init_base.yml > runner_cloud_init.yml

      - pulumi/update:
          stack: yemiwebby-org/cci-ml-runner/cci-runner-linux
          working_directory: pulumi
      - run:
          name: Store model server IP to workspace
          command: |
            mkdir -p workspace
            echo "export DEPLOY_SERVER_HOSTNAME=$(pulumi stack output modelserver_ip  --cwd pulumi --stack  yemiwebby-org/cci-ml-runner/cci-runner-linux)" > workspace/.env
            echo "export DEPLOY_SERVER_USERNAME=root" >> workspace/.env
            echo "export DEPLOY_SERVER_PASSWORD=password" >> workspace/.env
            echo "export DEPLOY_SERVER_PATH=/var/models" >> workspace/.env
      - persist_to_workspace:
          root: workspace
          paths:
            - .env

Pulumi provisioning scripts and Scaleway GPU resource configuration

Pulumi scripts live in the pulumi directory of the project. This is where you configure the resources. The files of note are:

  • Pulumi.yaml: Project and language configuration for the SDK.
  • Pulumi.cci-runner-linux.yaml: Configuration specific to this project, such as Scaleway project ID, region, and zone. You populated that earlier with your project ID.
  • requirements.txt: Python’s dependency spec for Pulumi.
  • __main__.py: The main Pulumi script for declaring resources.
  • runner_cloud_init_base.yml: cloud-init template script for the CircleCI runner instance, executed at first boot to bootstrap it.
  • modelserver_cloud_init.yml: cloud-init script for the model server instance.
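Before wiring everything into CI, you can optionally validate the Pulumi program locally. A quick sanity check, assuming your Scaleway credentials are exported in the same variables the pipeline uses:

export SCW_ACCESS_KEY=<your-access-key>
export SCW_SECRET_KEY=<your-secret-key>
# Show what would be created, without provisioning anything
pulumi preview --cwd pulumi --stack yemiwebby-org/cci-ml-runner/cci-runner-linux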

Scaleway instances

Provisioning resources with Pulumi

__main__.py provisions two Scaleway instances using the Server class from the pulumiverse_scaleway provider: one for the CircleCI runner (modelTrainingCCIRunner) and the other for serving models (tensorflowServer). Let’s walk through them starting with the runner:

modelTrainingCCIRunner = Server(
    "runnerServerLinux",
    zone=zone,
    type="GP1-XS",
    image="ubuntu_jammy",
    ip_id=runner_ip.id,
    root_volume={
        "size_in_gb": 80,
        "volume_type": "sbs_volume",
    },
    cloud_init=cloud_init_runner,
)

The instance is labeled runnerServerLinux, which is how it will appear in the Scaleway dashboard. This is the machine that will run your CircleCI jobs.

  • Type: The instance type is GP1-XS, a small general purpose instance that is sufficient for the lightweight ML tasks in this tutorial. For GPU-accelerated training, you can substitute one of Scaleway’s GPU instance types.
  • Image: ubuntu_jammy is a clean Ubuntu 22.04 image.

  • IP Address: runner_ip.id, a reserved public IP, is attached.

  • Volume: An 80 GB SSD volume is attached using Scaleway’s newer volume type, sbs_volume.

  • Cloud-init: The cloud_init_runner script is read from file and injected with the runner token generated earlier in the pipeline.

The cloud_init_runner is prepared with the following snippet:

with open("runner_cloud_init_base.yml") as f:
    cloud_init_runner = f.read().replace("RUNNER_TOKEN", runner_token)
cloud_init_runner = f"""#cloud-config
{cloud_init_runner}
"""

The base cloud-init script is read, the placeholder token is replaced, and the required #cloud-config header is prepended. This enables the runner VM to register itself with CircleCI automatically upon boot.

Next, define the CPU-based server for serving TensorFlow models:

tensorflowServer = Server(
    "tensorflowServerLinux",
    zone=zone,
    type="DEV1-L",
    image="ubuntu_jammy",
    ip_id=server_ip.id,
    root_volume={
        "size_in_gb": 40,
        "volume_type": "sbs_volume",
    },
    cloud_init=cloud_init_modelserver,
)

This uses a more affordable CPU-based instance type (DEV1-L) and a smaller 40 GB SSD volume. This server will be used to serve models via Docker. Note that it’s possible to use a GPU instance here too, but Scaleway’s free plan allows only one GPU VM at a time, so this approach helps keep costs minimal for the tutorial.

The cloud_init_modelserver script does a few things:

  • Sets up a user named demo with SSH access and sudo rights.
  • Installs Docker and its dependencies.
  • Sets up folder structure under /var/models for staging and production models.

This cloud-init is constructed as follows:

with open("modelserver_cloud_init.yml") as f:
    cloud_init_modelserver = f.read()

with open(os.path.expanduser("~/.ssh/id_rsa_modelserver.pub")) as f:
    public_key = f.read().strip()

cloud_init_modelserver = f"""#cloud-config
users:
  - name: demo
    ...
    ssh-authorized-keys:
      - {public_key}
...
"""

Embed the local SSH public key into the authorized keys section of the cloud-init to enable login access for the demo user. This is useful for debugging or remote file management.
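Once the stack is up, you can confirm that the key was embedded correctly by logging in as demo (a quick check; replace the placeholder with the modelserver_ip stack output):

# Log in with the private key matching the embedded public key
ssh -i ~/.ssh/id_rsa_modelserver demo@<modelserver_ip>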

Finally, export key outputs for downstream use in the CircleCI workflow:

pulumi.export("cci_runner_ip", modelTrainingCCIRunner.public_ips)
pulumi.export("cci_runner_id", modelTrainingCCIRunner.id)
pulumi.export("modelserver_id", tensorflowServer.id)
pulumi.export("modelserver_ip", server_ip.address)

Note: modelTrainingCCIRunner.public_ips is used here because the new pulumiverse_scaleway provider returns a list of public IPs, while server_ip.address is explicitly pulled from the reserved IP resource for the model server.

These outputs are later picked up in the pipeline and written to .env for SSH and deployment purposes.
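You can also inspect these outputs locally with the same command the pipeline uses:

pulumi stack output modelserver_ip --cwd pulumi --stack yemiwebby-org/cci-ml-runner/cci-runner-linux
pulumi stack output cci_runner_ip --cwd pulumi --stack yemiwebby-org/cci-ml-runner/cci-runner-linux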

Cloud-init scripts for CircleCI runner server

Now, let’s look at the updated runner_cloud_init_base.yml, which bootstraps the CircleCI runner instance with Docker and the CircleCI Machine Runner:

#!/bin/sh

export runner_token="RUNNER_TOKEN"

echo "Runner token $runner_token"

# Install Docker first
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) stable"
sudo apt update
sudo apt install -y docker.io python3-pip python3-venv

sudo systemctl enable docker
sudo systemctl start docker

# Now install CircleCI runner
curl -s https://packagecloud.io/install/repositories/circleci/runner/script.deb.sh?any=true | sudo bash
sudo apt install -y circleci-runner

# Give the circleci user docker access
sudo usermod -aG docker circleci

# Inject your runner token
sudo sed -i "s/RUNNER_TOKEN/$runner_token/g" /etc/circleci-runner/circleci-runner-config.yaml

# Enable & start the runner service
sudo systemctl enable circleci-runner
sudo systemctl start  circleci-runner

# Dump its status to the logs (non-blocking)
sudo systemctl status circleci-runner --no-pager || true

This shell script is executed automatically on first boot of the runner instance to prepare it as a CircleCI Machine Runner.

Here’s what it does:

  1. Sets the runner token. The RUNNER_TOKEN placeholder is dynamically injected from the Pulumi stack and exported for use throughout the script.

  2. Installs Docker and dependencies. Before installing the CircleCI runner, the script installs Docker and essential Python tooling (python3-pip, python3-venv). It also enables and starts the Docker service.

  3. Installs the CircleCI runner. The script pulls the CircleCI runner APT source, then installs the circleci-runner package.

  4. Grants Docker access. The circleci user is added to the docker group to allow jobs to run Docker commands if needed.

  5. Injects the runner token. The runner token is inserted into the CircleCI config file located at /etc/circleci-runner/circleci-runner-config.yaml.

  6. Enables and starts the runner service. The runner service is enabled and started via systemctl, making the instance immediately available to CircleCI pipelines.

  7. Prints runner status to logs (optional). Finally, the script logs the runner status for debugging purposes. The use of || true ensures the script doesn’t fail even if the status command exits non-zero.

Note: Keep the systemctl start circleci-runner instruction as the last step. Starting the runner early in the script might expose an incomplete setup to CircleCI, leading to flaky job behavior or missing dependencies.

This cloud-init file ensures the CircleCI runner is properly installed, secure, and Docker-ready, making the instance fully operational as soon as it boots.
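If the runner never shows up in the CircleCI UI, a useful debugging step is to SSH into the instance and check the service logs (a sketch; replace the placeholder with the cci_runner_ip stack output, and note this assumes your SSH key is registered in the Scaleway console):

# Inspect the runner service logs on the instance
ssh root@<runner_ip> "sudo journalctl -u circleci-runner --no-pager | tail -n 50"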

Model server cloud-init script

Now, let’s look at the other cloud-init script: modelserver_cloud_init.yml.

This script installs Docker Engine, sets up the tensorflow-serving Docker container, and prepares your environment for pushing and deploying new models.

This also creates a new SSH user demo, which will be used by the pipeline to push models to the server.

The Docker Engine installation steps are taken from this article on Scaleway’s tutorials site, which you can refer to for more details.

To prepare the serving and upload directories you need to first create /var/models and set ownership to the docker group:

# Create the directories and grant permissions so that the user defined in the .env file and docker can read/write to them
sudo mkdir -p /var/models/staging # so that docker will have something to bind to, it will be populated later
sudo mkdir -p /var/models/prod
sudo chown -R $USER:docker /var/models
sudo chmod -R 775 /var/models

Next, create the demo user that will be used for uploading models from the pipeline:

# Create demo user for SFTP upload if not exists
if ! id demo &>/dev/null; then
  useradd -m -s /bin/bash demo
fi
usermod -aG docker demo

Finally, download the tensorflow_serving image and run the container:


# Download the TensorFlow Serving Docker image and repo
docker pull tensorflow/serving

# Run TensorFlow Serving container (if not already running)
if ! docker ps -a --format '{{.Names}}' | grep -w tensorflow_serving; then
  docker run -d --name tensorflow_serving -p 8501:8501 \
    -v /var/models/prod:/models/my_model \
    -e MODEL_NAME=my_model tensorflow/serving
fi

Your model server is ready for the newly trained models to be uploaded.
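Once a model has been deployed to /var/models/prod, you can sanity-check the container from your machine. TensorFlow Serving exposes a REST status endpoint on port 8501 (placeholder IP below):

# Query the model status endpoint; a healthy server reports the model version state
curl http://<modelserver_ip>:8501/v1/models/my_model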

Using the new runner resource class to run AI/ML workloads in a CircleCI pipeline

After provisioning of cloud infrastructure and runner resources is complete, you can move on to the “fun stuff”. The subsequent jobs — install-build, train, test, package, deploy, and test-deployment — are based on a previous blog post. You can review that tutorial for more detail. This tutorial covers the differences between the two repositories that are specific to building on Scaleway’s GPU infrastructure.

In short, each of the jobs has a corresponding Python script in the ml/ directory that performs the tasks for that segment of the pipeline: building the dataset, training, testing, and so on. The jobs use the CircleCI workspace to pass intermediate artifacts between them.

In all the jobs, you will use the newly created executor default_linux_machine. For ease of reuse, it is declared in the config.yml:

executors:
  default_linux_machine:
    machine:
      image: ubuntu-2204:current

All the jobs declare this executor, and CircleCI will then run them on your Scaleway instance.

install-build:
  executor: default_linux_machine
  steps: …

When the model has been trained and tested, it’s time to package and deploy it. You will use your model server instance for that, passing the newly created instance’s IP address to those jobs via the workspace.

You have already used the workspace to store the .env file containing DEPLOY_SERVER_HOSTNAME. You now need to retrieve it in the package job.

The populate-env command grabs it from the workspace and makes it available to other commands in the job as an environment variable, using the $BASH_ENV file:

populate-env:
  steps:
    - attach_workspace:
        at: .
    - run:
        name: Restore secrets from workspace and add to environment vars
        # Environment variables must be configured in a CircleCI project or context
        command: |
          cat .env >> $BASH_ENV
          source $BASH_ENV

The ml/4_package.py script then takes the created model and uses SFTP to upload it to the model server’s staging directory.
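For reference, a roughly equivalent manual upload from your own machine might look like this (a sketch only; model/ is a hypothetical local path for the packaged model, and the variables come from the .env file):

# Hypothetical manual equivalent of the upload performed by ml/4_package.py
scp -i ~/.ssh/id_rsa_modelserver -r model/ \
  "$DEPLOY_SERVER_USERNAME@$DEPLOY_SERVER_HOSTNAME:$DEPLOY_SERVER_PATH/staging/"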

The same is done in the deploy and test-deployment jobs, which also need access to the same IP address for their work.

Now, review the orchestrated build-deploy workflow with all the jobs:

workflows:
  # This workflow does a full build from scratch and deploys the model
  build-deploy:
    jobs:
      - provision_runner:
          context:
            - pulumi
            - scaleway
            - circleci-api
      - install-build:
          requires:
            - provision_runner
          context:
            - ml-scaleway-demo
      - train:
          requires:
            - install-build
      - test:
          requires:
            - train
      - package:
          requires:
            - test
          context:
            - ml-scaleway-demo
      # Do not deploy without manual approval - you can inspect the console output from training and make sure you are happy to deploy
      - deploy:
          requires:
            - package
          context:
            - ml-scaleway-demo
      - test-deployment:
          requires:
            - deploy
          context:
            - ml-scaleway-demo
      - approve_destroy:
          type: approval
          requires:
            - test-deployment
      - destroy_runner:
          context:
            - pulumi
            - scaleway
            - circleci-api
          requires:
            - approve_destroy

The workflow definition orchestrates all the jobs in your pipeline according to their dependencies and execution conditions. It is also where you pass in the contexts containing the right environment variables.

For example, your provision_runner job needs access to multiple environment variables holding the Pulumi, Scaleway, and CircleCI API keys, so you pass all three contexts to it.

jobs:
  - provision_runner:
      context:
        - pulumi
        - scaleway
        - circleci-api

Similarly, the deploy job needs access to the deployment server credentials and path, which are stored in the ml-scaleway-demo context:

- deploy:
    requires:
      - package
    context:
      - ml-scaleway-demo

The final jobs in the workflow are approve_destroy, which introduces a manual approval step, and destroy_runner, which cleans up your created infrastructure:

- approve_destroy:
    type: approval
    requires:
      - test-deployment
- destroy_runner:
    context:
      - pulumi
      - scaleway
      - circleci-api
    requires:
      - approve_destroy

Destroy runner and clean up the environment

The destroy_runner job is responsible for tearing down the infrastructure provisioned earlier in the pipeline, specifically the CircleCI self-hosted runner and associated resources (VM, public IP, and volumes). This is important for keeping costs low and ensuring that resources are ephemeral after use.

The job runs in a Docker environment using Python 3.11.9:

jobs:
  destroy_runner:
    docker:
      - image: cimg/python:3.11.9

Check out the project repository:

- checkout

Next, install the CircleCI CLI so you can interact with CircleCI resource classes or tokens if needed (not strictly used here but kept for consistency and future extensibility):

- run:
    name: Install CircleCI CLI
    command: |
      curl -fLSs https://raw.githubusercontent.com/CircleCI-Public/circleci-cli/main/install.sh | sudo bash

Pulumi is then authenticated using the provided API token via the pulumi/login command. This allows the job to run Pulumi operations securely:

- pulumi/login

Finally, destroy the Pulumi stack associated with your infrastructure:

- pulumi/destroy:
    stack: yemiwebby-org/cci-ml-runner/cci-runner-linux
    working_directory: pulumi

This command runs pulumi destroy on the specified stack inside the pulumi folder. It deletes all cloud resources (e.g., Scaleway servers, IPs, volumes) defined in the __main__.py file, effectively cleaning up the environment.
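If a run is interrupted before teardown, you can perform the equivalent cleanup locally:

# Manually destroy the stack's resources if the pipeline never reached destroy_runner
pulumi destroy --cwd pulumi --stack yemiwebby-org/cci-ml-runner/cci-runner-linux --yes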

Destroy runner and clean up

Conclusion

This wraps up the tutorial. We covered the intricacies of using CircleCI runners to execute AI/ML workloads on your own infrastructure. In our example, we used Scaleway cloud and its compute instances, but the principles covered can be applied to any type of infrastructure. We also covered using the flexibility the cloud offers to consume resources only when needed. That is common practice for a CI/CD pipeline with a clearly defined beginning and end, but it is less common with more traditional infrastructure. By leveraging as-needed cloud resources, CircleCI can help you manage your AI/ML workflows more efficiently.

CircleCI is the leading CI/CD platform for managing and automating AI workflows. It takes minutes to set up, and you can evaluate it at no cost, with up to 6,000 minutes of free usage per month. Sign up for your free account to get started today.