Tutorials | Jul 24, 2024 | 11 min read

Build and deploy a Dockerized OpenCV application on AWS Lambda

Vivek Maskara

Software Engineer

OpenCV is a powerful open source computer vision and machine learning software library for real-time applications. It provides multiple functionalities, including image processing, object detection, and video analysis, making it a fundamental tool for researchers and developers in the field of computer vision.

You can deploy your OpenCV applications using AWS Lambda, a serverless compute service that lets you run code without provisioning or managing servers. The traditional approach of deploying Lambda functions as a ZIP archive might not be suitable for OpenCV applications, because a ZIP deployment package is limited to 250 MB unzipped while a container image can be up to 10 GB. You can instead Dockerize your OpenCV application and deploy the Lambda as a container image.

This tutorial walks you through the steps for creating an OpenCV function to convert a color image to a grayscale image. You will learn how to containerize the application with Docker and to deploy your container image to AWS Lambda. You will also learn how to automate your deployments with CircleCI for faster, more reliable updates.

Prerequisites

To complete this tutorial, you will need a Python development environment and Docker on your machine, the AWS CLI, the Serverless Framework, and an AWS account with an S3 bucket and an ECR repository. You will also need a GitHub account and a CircleCI account to automate the deployment of the OpenCV application.

Creating a new Python project

First, create a new directory for your Python project and navigate into it.

mkdir opencv-docker-aws-lambda-circleci
cd opencv-docker-aws-lambda-circleci

Installing dependencies

In this tutorial, we will use the opencv-python-headless Python package for the OpenCV functions and boto3 for uploading the output image to AWS S3. We will also use the requests package for network calls.

Create a requirements.txt file in the project’s root and add the following content to it:

opencv-python-headless
boto3
simplejson
requests

To install the dependencies, use the pip install command (in your terminal):

pip install -r requirements.txt

Note: You will probably want to use a virtual environment.

Defining the OpenCV script

OpenCV offers various tools and utilities for image processing. In this tutorial, we will use the cvtColor() function to convert a color image to a grayscale image. Note that OpenCV loads color images in BGR channel order rather than RGB, which is why the script uses the COLOR_BGR2GRAY conversion code. Our script will accept an image URL, download the image to a temp directory, use OpenCV for image processing, and upload the output image to an AWS S3 bucket.

Create a process_task.py file in the opencv-docker-aws-lambda-circleci directory and add utility functions for downloading an image from a URL and uploading it to S3.

import requests
import boto3
import os
import cv2

tmp_dir = "/tmp"
S3_BUCKET_NAME = os.environ["S3_BUCKET_NAME"]

def download_image(image_url):
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0",
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Accept-Language": "en-US,en;q=0.9",
        }
        # download image from the internet; fail fast on timeouts and HTTP errors
        response = requests.get(image_url, headers=headers, timeout=10)
        response.raise_for_status()
        image = response.content

        # save the image to a temporary directory
        with open(f"{tmp_dir}/image.jpg", "wb") as f:
            f.write(image)
            print("Image downloaded successfully")

        return f"{tmp_dir}/image.jpg"

    except Exception as e:
        print("Error: ", e)
        raise e

def upload_image_to_s3(image_path):
    s3 = boto3.client("s3")

    s3_file_path = f"gray_images/{os.path.basename(image_path)}"

    with open(image_path, "rb") as f:
        s3.upload_fileobj(f, S3_BUCKET_NAME, s3_file_path)

    s3_file_url = f"https://{S3_BUCKET_NAME}.s3.amazonaws.com/{s3_file_path}"
    return s3_file_url

Let’s go over the above code snippet:

  • The download_image() function uses the requests library to fetch and save the image to a temporary file.
  • upload_image_to_s3() accepts a local image path and uploads the image to the AWS S3 bucket.

Next, define a function in the process_task.py file to convert a color image to grayscale.

# update imports to add cv2
import cv2

def convert_image_to_grayscale(image_url):
    tmp_img_path = download_image(image_url)
    output_img_path = f"{tmp_dir}/gray_image.jpg"

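    image = cv2.imread(tmp_img_path)
    # imread() returns None instead of raising when the file is not a readable image
    if image is None:
        raise ValueError(f"Could not read an image from {image_url}")
    # convert the image to grayscale
    gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)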

    # save the grayscale image to a temporary directory
    cv2.imwrite(output_img_path, gray_image)

    # upload the grayscale image to S3
    s3_file_path = upload_image_to_s3(output_img_path)
    return s3_file_path

In the above code snippet:

  • convert_image_to_grayscale() uses download_image() to download the image to a temp path.
  • The image is read using OpenCV’s imread() and converted to grayscale using cvtColor().
  • Finally, the output gray_image is persisted to a local file, and upload_image_to_s3() uploads it to AWS S3.
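
To make the conversion less of a black box, here is a rough pure-Python sketch of what COLOR_BGR2GRAY computes for a single 8-bit pixel: a weighted sum of the three channels using the standard luma weights (0.299 for red, 0.587 for green, 0.114 for blue). OpenCV’s own implementation is vectorized and uses fixed-point arithmetic, so treat bgr_pixel_to_gray() as an illustrative approximation and not as the library’s code.

```python
def bgr_pixel_to_gray(pixel):
    """Approximate the gray value OpenCV computes for one 8-bit BGR pixel."""
    blue, green, red = pixel  # OpenCV stores channels in B, G, R order
    luma = 0.299 * red + 0.587 * green + 0.114 * blue
    return int(round(luma))


samples = [("white", (255, 255, 255)), ("red", (0, 0, 255)), ("blue", (255, 0, 0))]
for name, pixel in samples:
    print(name, bgr_pixel_to_gray(pixel))
```

Green contributes the most to perceived brightness, so a pure green pixel ends up much lighter in the grayscale image than a pure blue one.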

Defining the Lambda function handler

Next, you need to define an AWS Lambda function handler that will be called when anyone invokes the Lambda. Create an app.py file in the project’s root and add the following contents:

import json
from process_task import convert_image_to_grayscale

def lambda_handler(event, context):
    try:
        request_body = json.loads(event['body'])
        image_url = request_body['imageUrl']

        gray_image_path = convert_image_to_grayscale(image_url)

        return {
            'statusCode': 200,
            'body': json.dumps({
                'grayImagePath': gray_image_path
            })
        }

    except Exception as e:
        print("Error: ", e)
        return {
            'statusCode': 500,
            'body': json.dumps({
                'error': str(e)
            })
        }

if __name__ == "__main__":
    # Local smoke test; requires S3_BUCKET_NAME and AWS credentials in the environment
    event = {
        'body': json.dumps({
            'imageUrl': 'https://picsum.photos/id/237/536/354'
        })
    }

    context = {}

    print(lambda_handler(event, context))

The lambda_handler() accepts a JSON payload and retrieves the imageUrl. It calls convert_image_to_grayscale() to convert the image to grayscale and returns the output image path.
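
As written, the handler reports every failure as a 500, including a missing or malformed imageUrl. If you want to distinguish client errors, you could validate the payload before processing. The following is a minimal sketch under that assumption; extract_image_url() and bad_request() are hypothetical helpers that do not appear in the tutorial’s code.

```python
import json


def extract_image_url(event: dict) -> str:
    """Return imageUrl from the event body, or raise ValueError (hypothetical helper)."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError as exc:
        raise ValueError("request body is not valid JSON") from exc
    image_url = body.get("imageUrl") if isinstance(body, dict) else None
    if not isinstance(image_url, str) or not image_url.startswith(("http://", "https://")):
        raise ValueError("imageUrl must be an http(s) URL")
    return image_url


def bad_request(message: str) -> dict:
    """Build a 400 response in the same shape the handler already returns."""
    return {"statusCode": 400, "body": json.dumps({"error": message})}


event = {"body": json.dumps({"imageUrl": "https://picsum.photos/id/237/536/354"})}
print(extract_image_url(event))
print(bad_request("imageUrl must be an http(s) URL"))
```

Inside lambda_handler(), you would call extract_image_url() first and return bad_request(str(error)) when it raises ValueError, leaving the existing 500 branch for unexpected failures.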

Defining the Dockerfile

There are multiple approaches for building a Python Lambda container image. This tutorial will use a non-AWS base image to deploy the OpenCV application. The primary advantage of using a non-AWS image is its flexibility. You can customize the available packages and harden the image to match your organization’s security policies.

To keep the container image slim, we will use a multi-stage Docker build. In the first stage, we will do the following:

  • Install the system packages required for building OpenCV’s headless version.
  • Install all the Python packages needed by our application.
  • Install awslambdaric, the runtime interface client for Lambda.

In the final stage, we will do the following:

  • Copy the dependencies from the first stage to our runtime image.
  • Configure the container’s entry point.

To start, create a Dockerfile in the project’s root and add the following:

# Define custom function directory
ARG FUNCTION_DIR="/function"

FROM python:3.9.18-slim-bullseye as build-image

# Include global arg in this stage of the build
ARG FUNCTION_DIR

# Install aws-lambda-cpp build dependencies
RUN apt-get update && \
  apt-get install -y \
  g++ \
  make \
  cmake \
  unzip \
  libcurl4-openssl-dev

RUN apt-get install -y --fix-missing \
    build-essential \
    cmake \
    gfortran \
    git \
    wget \
    curl \
    ffmpeg \
    libsm6 \
    libxext6 \
    graphicsmagick \
    libgraphicsmagick1-dev \
    libatlas-base-dev \
    libavcodec-dev \
    libavformat-dev \
    libgtk2.0-dev \
    libjpeg-dev \
    liblapack-dev \
    libswscale-dev \
    pkg-config \
    python3-dev \
    python3-numpy \
    software-properties-common \
    zip \
    && apt-get clean && rm -rf /tmp/* /var/tmp/*

# Copy function code
RUN mkdir -p ${FUNCTION_DIR}
COPY . ${FUNCTION_DIR}

WORKDIR ${FUNCTION_DIR}

RUN pip install --upgrade pip

# Install the function's dependencies
RUN pip install -r requirements.txt --target ${FUNCTION_DIR}

# Install the Lambda runtime interface client
RUN pip install \
    --target ${FUNCTION_DIR} \
        awslambdaric

FROM python:3.9.18-slim-bullseye as runtime-image

# Include global arg in this stage of the build
ARG FUNCTION_DIR
# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

ENV NUMBA_CACHE_DIR=/tmp

# Copy in the built dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}

FROM runtime-image

# Include global arg in this stage of the build (declare it before its first use)
ARG FUNCTION_DIR

COPY . ${FUNCTION_DIR}

# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

ENV NUMBA_CACHE_DIR=/tmp
ENV MPLCONFIGDIR=/tmp

# Turn on Graviton2 optimization
ENV DNNL_DEFAULT_FPMATH_MODE=BF16
ENV LRU_CACHE_CAPACITY=1024

ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]
CMD [ "app.lambda_handler" ]

The multi-stage Docker build ensures that the runtime image contains only the required packages and dependencies. For example, the Linux packages that you installed so that pip could build and compile the Python dependencies are not part of the runtime image. Those packages are required only in the build stage.

Building and testing the container locally

In this section, you will learn how to emulate the AWS Lambda runtime on your local machine and use it to test the container image.

Enable local testing

You can use the AWS Lambda Runtime Interface Emulator to test the AWS Lambda function locally. You can read more about its internals in this guide.

First, create an entry.sh file in the project’s root and add the following contents to it:

#!/bin/sh
# Quote "$@" so that arguments are passed through unchanged
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
  exec /usr/local/bin/aws-lambda-rie /usr/local/bin/python -m awslambdaric "$@"
else
  exec /usr/local/bin/python -m awslambdaric "$@"
fi

The code snippet above does the following:

  • If AWS_LAMBDA_RUNTIME_API is unset or empty, it indicates that the code is running locally, because AWS Lambda sets this variable in its own execution environment.
  • If running locally, the script uses the AWS Lambda Runtime Interface Emulator (aws-lambda-rie) to emulate the Lambda environment.
  • If the code is running on AWS Lambda, the script directly runs the AWS Lambda Runtime Interface Client (RIC) with Python.
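
The branching in entry.sh can be restated in Python for readers who find shell tests hard to parse. This is only a sketch of the decision logic, not a replacement for the script; build_command() is a hypothetical name, and the runtime API address in the usage lines is an example value.

```python
def build_command(env: dict, handler: str) -> list:
    """Mirror entry.sh: wrap the runtime client in the emulator when running locally."""
    ric = ["/usr/local/bin/python", "-m", "awslambdaric", handler]
    # An unset or empty AWS_LAMBDA_RUNTIME_API means we are not inside AWS Lambda.
    if not env.get("AWS_LAMBDA_RUNTIME_API"):
        return ["/usr/local/bin/aws-lambda-rie"] + ric
    return ric


# Local run: the emulator is placed in front of the runtime interface client.
print(build_command({}, "app.lambda_handler"))
# Inside Lambda (example address): the runtime interface client runs directly.
print(build_command({"AWS_LAMBDA_RUNTIME_API": "127.0.0.1:9001"}, "app.lambda_handler"))
```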

Next, follow the instructions to install the aws-lambda-rie emulator for your platform. Be sure to move the aws-lambda-rie to the directory containing the Dockerfile, which is the root folder for the project.

Finally, update the Dockerfile to modify the ENTRYPOINT.

# replace the ENTRYPOINT and CMD with the code below
COPY ./entry.sh /entry.sh
RUN chmod +x /entry.sh
ADD aws-lambda-rie /usr/local/bin/aws-lambda-rie

ENTRYPOINT [ "/entry.sh", "app.lambda_handler" ]

Building the container image

Execute the following command to build the container image:

docker build --platform linux/amd64 -t opencv-grayscale-image:1.0.0 .

Note: The --platform parameter is optional but can be useful on devices with Apple Silicon chips.

Running the container

Before running the Docker container, create an .env file and add the AWS S3 bucket name to it:

S3_BUCKET_NAME=<your-bucket-name>

You will need the .env file to pass runtime environment variables to the Docker container.

Execute the docker run command to run the Docker image:

docker run --platform linux/amd64 -e AWS_ACCESS_KEY_ID='<YOUR_KEY>' \
-e AWS_SECRET_ACCESS_KEY='<YOUR_ACCESS_KEY>' \
-e AWS_DEFAULT_REGION='us-west-2' \
--env-file .env \
-p 9000:8080 opencv-grayscale-image:1.0.0

Make sure to replace the values for AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY with your credentials.

Notice that the -p flag maps the Docker container’s port 8080 to your device’s port 9000. To test if the Lambda function works as expected, you can use the following curl command:

curl "http://localhost:9000/2015-03-31/functions/function/invocations" \
-d '{"body": "{\"imageUrl\":\"https://picsum.photos/id/237/200/300\"}" }'

Note: if you get an error like this:

curl: (56) Recv failure: Connection reset by peer

Check that port 9000 is not being used by another service.

You can replace the image URL with a different image. The output of the command would look something like this:

{"statusCode": 200, "body": "{\"grayImagePath\": \"https://open-cv-circleci-tutorial.s3.amazonaws.com/gray_images/gray_image.jpg\"}"}
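
Note that both the request and the response carry JSON inside JSON: the body field is itself a JSON-encoded string. If you script this test instead of using curl, you need to encode and decode twice. The sketch below shows one way to do that with the standard library; build_payload() and parse_gray_image_path() are hypothetical helper names, and the sample string is the response shown above.

```python
import json


def build_payload(image_url: str) -> str:
    """Wrap the request in a 'body' string, as the handler expects."""
    return json.dumps({"body": json.dumps({"imageUrl": image_url})})


def parse_gray_image_path(raw_response: str) -> str:
    """Decode the outer response, then the JSON string stored in its body."""
    response = json.loads(raw_response)
    if response["statusCode"] != 200:
        raise RuntimeError(f"invocation failed: {response['body']}")
    return json.loads(response["body"])["grayImagePath"]


sample = '{"statusCode": 200, "body": "{\\"grayImagePath\\": \\"https://open-cv-circleci-tutorial.s3.amazonaws.com/gray_images/gray_image.jpg\\"}"}'
print(build_payload("https://picsum.photos/id/237/200/300"))
print(parse_gray_image_path(sample))
```

You could send the string returned by build_payload() as the POST body to the same invocation URL that the curl command uses.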

Pushing the container image to AWS ECR

Now that you have tested that your function works locally, you can deploy it to your AWS account. To deploy the function, you first need to push the image to a remote container repository. In this tutorial, we will use the AWS ECR repository to host the container image.

Authenticate with AWS ECR

Use the AWS CLI to retrieve an authentication token and authenticate your Docker client to your registry using the following command:

aws ecr get-login-password --region <AWS_REGION> | docker login --username AWS --password-stdin <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com

You can find the ACCOUNT_ID in the AWS Management Console. Also make sure that your AWS credentials are set up correctly by running export AWS_ACCESS_KEY_ID=<your_access_key> and export AWS_SECRET_ACCESS_KEY=<your_secret_key>.

Once you are successfully authenticated, you will see a Login Succeeded message. Then you can proceed to the next step.

Tag and push the image

Next, use the following command to tag the image that you built in the previous section:

docker tag opencv-grayscale-image:1.0.0 <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<ECR_REPO_NAME>:1.0.0

Make sure to replace ECR_REPO_NAME with the name of an existing AWS ECR repo in your AWS account.
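
The image reference used by both docker tag and docker push follows a fixed pattern, so it can help to build it in one place if you script these steps. Here is a small sketch; ecr_image_uri() is a hypothetical helper, and the account ID and repository name below are placeholders, not real resources.

```python
def ecr_image_uri(account_id: str, region: str, repo_name: str, tag: str) -> str:
    """Compose the registry/repository:tag reference used by docker tag and docker push."""
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repo_name}:{tag}"


# Placeholder values for illustration only.
print(ecr_image_uri("123456789012", "us-west-2", "opencv-grayscale", "1.0.0"))
```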

Finally, push the image to AWS ECR using the following command:

docker push <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com/<ECR_REPO_NAME>:1.0.0

Deploying the AWS Lambda function

Now that the container image has been built and pushed to AWS ECR, you are ready to deploy the Lambda function. We will use the Serverless Framework to do this. Create a serverless.yml file in the project’s root and add the following:

service: opencv-grayscale-lambda

frameworkVersion: "3"

provider:
  name: aws
  region: us-west-2
  runtime: python3.9
  iam:
    role:
      statements:
        - Effect: "Allow"
          Action:
            - "s3:*"
          Resource: ["arn:aws:s3:::${param:bucket_name}", "arn:aws:s3:::${param:bucket_name}/*"]

functions:
  OpenCVGrayscaleLambdaV2:
    image: ${param:account_id}.dkr.ecr.${param:region}.amazonaws.com/${param:repo_name}@${param:image_sha}
    timeout: 60
    memorySize: 1024
    environment:
      S3_BUCKET_NAME: ${param:bucket_name}

In this YAML code snippet:

  • You define an AWS Lambda function named OpenCVGrayscaleLambdaV2 and use the AWS ECR image as the Lambda execution code.
  • You also set the timeout, memorySize, and environment variables for the Lambda function.

To deploy the Lambda function, you can execute the serverless deploy command as follows:

serverless deploy \
--param="bucket_name=${S3_BUCKET_NAME}" \
--param="account_url=${AWS_ECR_ACCOUNT_URL}" \
--param="repo_name=${ECR_REPO_NAME}" \
--param="image_sha=${IMAGE_DIGEST}" \
--param="region=${AWS_DEFAULT_REGION}" \
--param="account_id=${AWS_ACCOUNT_ID}"

Make sure that the environment variables needed by the command are set before you execute it:

  • S3_BUCKET_NAME corresponds to the name of the AWS S3 bucket used for hosting the output image.
  • AWS_ECR_ACCOUNT_URL is of the form <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com.
  • ECR_REPO_NAME is the name of the AWS ECR repo.
  • IMAGE_DIGEST refers to the SHA-256 digest of the container image, in the form sha256:<hash>. You can find the image digest in the AWS Management Console.
  • AWS_DEFAULT_REGION is the region for your AWS resources.
  • AWS_ACCOUNT_ID is the AWS account ID.
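
A deploy that starts with an unset variable passes an empty parameter to the Serverless Framework, which can be harder to debug than a quick check up front. The following sketch checks the six variables listed above before you run the command; missing_vars() is a hypothetical helper and not part of the Serverless CLI.

```python
import os

# The variables referenced by the serverless deploy command in this tutorial.
REQUIRED_VARS = [
    "S3_BUCKET_NAME",
    "AWS_ECR_ACCOUNT_URL",
    "ECR_REPO_NAME",
    "IMAGE_DIGEST",
    "AWS_DEFAULT_REGION",
    "AWS_ACCOUNT_ID",
]


def missing_vars(env, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_vars(os.environ)
if missing:
    print("Set these variables before deploying:", ", ".join(missing))
else:
    print("All deployment variables are set.")
```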

Once you execute the above command, it will create a new AWS Lambda function in your AWS account.

Automating deployments with CircleCI

Now that you have deployed the application to AWS Lambda using the CLI, you can automate future deployments using CircleCI for continuous deployment. The CircleCI workflow will build the container image, push it to AWS ECR, and deploy a new version of the AWS Lambda function. If you haven’t already signed up for your free CircleCI account, be sure to do that now.

Add the configuration script

First, add a .circleci/config.yml file in the project’s root containing the configuration for the CI pipeline. Add this code snippet to it.

version: 2.1

orbs:
  aws-ecr: circleci/aws-ecr@9.0.4
  aws-cli: circleci/aws-cli@4.1.3
  serverless: circleci/serverless-framework@2.0.1

jobs:
  build_and_push_to_ecr:
    machine:
      image: ubuntu-2204:current
    steps:
      - checkout
      - aws-ecr/build_and_push_image:
          auth:
            - aws-cli/setup:
                aws_access_key_id: "${AWS_ACCESS_KEY_ID}"
                aws_secret_access_key: "${AWS_SECRET_ACCESS_KEY}"
                region: "${AWS_DEFAULT_REGION}"
          create_repo: false
          dockerfile: Dockerfile
          platform: linux/amd64
          push_image: false
          region: "${AWS_DEFAULT_REGION}"
          repo: "${AWS_ECR_REPO_NAME}"
          tag: "${CIRCLE_SHA1}"
          workspace_root: .
      - aws-ecr/push_image:
          region: "${AWS_DEFAULT_REGION}"
          repo: "${AWS_ECR_REPO_NAME}"
          tag: "${CIRCLE_SHA1}"
  deploy_using_serverless:
    executor: serverless/default
    steps:
      - checkout
      - aws-cli/setup
      - serverless/setup
      - run:
          name: Get ECR Image Tag and Deploy Lambda
          command: |
            IMAGE_DIGEST=$(aws ecr describe-images --repository-name ${AWS_ECR_REPO_NAME} --query 'sort_by(imageDetails,& imagePushedAt)[-1].imageDigest')

            IMAGE_DIGEST=$(echo $IMAGE_DIGEST | sed -e 's/^"//' -e 's/"$//')

            serverless deploy --param="bucket_name=${BUCKET_NAME}" --param="account_id=${AWS_ACCOUNT_ID}" --param="region=${AWS_DEFAULT_REGION}" --param="repo_name=${AWS_ECR_REPO_NAME}" --param="image_sha=${IMAGE_DIGEST}"

workflows:
  deploy_lambda:
    jobs:
      - build_and_push_to_ecr
      - deploy_using_serverless:
          requires:
            - build_and_push_to_ecr

Take a moment to review the CircleCI configuration:

  • The config defines the build_and_push_to_ecr job, which uses the circleci/aws-ecr orb.
  • The build_and_push_to_ecr job builds the Docker image and pushes the container image to AWS ECR. It tags the container image with a commit hash (CIRCLE_SHA1).
  • The deploy_using_serverless job depends on the success of the build_and_push_to_ecr job. It uses the circleci/serverless-framework orb to deploy the Lambda function.
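
The deploy step selects the image to deploy with the query sort_by(imageDetails,& imagePushedAt)[-1].imageDigest, which picks the most recently pushed image in the repository and not specifically the one tagged with the current commit. The sketch below reproduces that selection in Python on made-up records so you can see what the query returns. It assumes the timestamps are ISO 8601 strings with the same UTC offset, so that comparing them as strings matches chronological order; latest_image_digest() is a hypothetical helper.

```python
def latest_image_digest(image_details):
    """Return the digest of the most recently pushed image, like sort_by(...)[-1]."""
    if not image_details:
        raise ValueError("repository contains no images")
    newest = max(image_details, key=lambda detail: detail["imagePushedAt"])
    return newest["imageDigest"]


# Made-up records shaped like the imageDetails entries the query sorts.
sample_details = [
    {"imageDigest": "sha256:aaa", "imagePushedAt": "2024-07-01T10:00:00+00:00"},
    {"imageDigest": "sha256:ccc", "imagePushedAt": "2024-07-03T09:30:00+00:00"},
    {"imageDigest": "sha256:bbb", "imagePushedAt": "2024-07-02T18:45:00+00:00"},
]
print(latest_image_digest(sample_details))
```

If several pipelines push to the same repository at once, the latest image may not be the one your commit built, so you may prefer to look up the digest by the CIRCLE_SHA1 tag.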

Now that the configuration file has been properly set up, create a repository for the project on GitHub and push all the code to it. Review Pushing a project to GitHub for instructions.

Setting up the project on CircleCI

Next, log in to your CircleCI account. On the CircleCI dashboard, click the Projects tab, search for the GitHub repo name, and click Set Up Project for your project.

Setup project on CircleCI

You will be prompted to add a new configuration file manually or use an existing one. Since you have already pushed the required configuration file to the codebase, select the Fastest option and enter the name of the branch hosting your configuration file. Click Set Up Project to continue.

Configure project on CircleCI

Completing the setup will trigger the pipeline, but it will fail since the environment variables are not set. Let’s do that next.

Set environment variables

On the project page, click Project settings and go to the Environment variables tab. On the screen that appears, click the Add environment variable button and set the following environment variables:

  • AWS_ACCESS_KEY_ID: the access key obtained while configuring programmatic access for AWS.
  • AWS_SECRET_ACCESS_KEY: the secret access key obtained while configuring programmatic access for AWS.
  • AWS_ACCOUNT_ID: the AWS account ID. You can find this value in the AWS Management Console.
  • AWS_DEFAULT_REGION: the region for your AWS resources.
  • AWS_ECR_REPO_NAME: the name of the AWS ECR repo you created for hosting the container images.
  • BUCKET_NAME: the name of the AWS S3 bucket for hosting the output images.
  • AWS_ECR_ACCOUNT_URL: the AWS ECR account URL. It is of the form <ACCOUNT_ID>.dkr.ecr.<AWS_REGION>.amazonaws.com.

Once you add the environment variables, they should be listed on the dashboard.

Set environment variables on CircleCI

Now that the environment variables are configured, trigger the pipeline again. This time the build should succeed.

Successful build on CircleCI

Now, any time you make changes to your code and push them to your repository, CircleCI will automatically trigger the pipeline to build your Docker image, push it to Amazon ECR, and deploy your application using the Serverless framework.

Conclusion

In this tutorial, you learned how to automatically build and deploy a containerized OpenCV application to AWS Lambda using CircleCI. OpenCV reduces the complexity of working with image processing applications. AWS Lambda is a serverless platform for deploying Python applications with minimal effort and can be used to scale your application based on the load.

With CircleCI, you can automate the build, test, and deployment pipeline for continuous integration and continuous deployment (CI/CD). The pipeline can be used to build the Docker image, push it to a remote container registry, and deploy the AWS Lambda function.

You can check out the complete source code used in this tutorial on GitHub.
