Tutorials | Mar 20, 2022 | 15 min read

Schedule database backups for MongoDB in a Node.js application

Olususi Oluyemi

Fullstack Developer and Tech Author


Database backup protects your data by creating a copy of your database locally, or remotely on a backup server. This operation is often performed manually by database administrators. Like every other human-dependent activity, it is susceptible to errors and requires lots of time.

Regularly scheduled backups go a long way to safeguarding your customers’ details in the case of operating system failure or security breach. In this tutorial, I will guide you through the process of scheduling a backup version of your application’s database at a defined regular interval using scheduled pipelines.

To get a good grasp of the automated database backup process, we will set up a database backup for a Node.js application with a MongoDB database. This application will be deployed on Heroku using the deployment pipeline on CircleCI. The MongoDB database will be hosted on MongoDB Atlas, a multi-cloud platform for database hosting and deployment.

For easy access, the generated, backed-up MongoDB collection for our application will be stored on Microsoft Azure Storage.

Prerequisites

Here is what you need to follow this tutorial successfully:

Cloning the demo application

To get started, run this command to clone the demo application:

git clone https://github.com/yemiwebby/db-cleanup-starter.git db-back-up-schedule

Next, move into the newly cloned app and install all its dependencies:

cd db-back-up-schedule
npm install

This application contains these endpoints:

  • new-company creates a new company by specifying the name of the company and its founder.
  • companies retrieves the list of companies from the database.

When the installation process has finished, create a .env file and populate it with this:

MONGODB_URI=YOUR_MONGODB_URL

Alternatively, run this command to copy the content of the .env.sample file included in the starter project:

cp .env.sample .env

You need to replace the YOUR_MONGODB_URL placeholder with the connection string for your remote MongoDB database. This tutorial uses a MongoDB Atlas database, which is easy to set up. I will explain how to do that next.

Creating a MongoDB Atlas account and database

Create a free Atlas account here and follow the instructions to deploy a free tier cluster. Once you have a cluster and database user set up, open and edit the .env file.

Replace the YOUR_MONGODB_URL placeholder with the extracted connection string from your MongoDB Atlas dashboard:

MONGODB_URI=mongodb+srv://<username>:<password>@<clustername>.mongodb.net/<dbname>?retryWrites=true&w=majority

Replace the <username>, <password>, <clustername> and <dbname> with the values for your cluster.
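If your database password contains characters such as @, / or :, they need to be percent-encoded before they go into the connection string. Here is a minimal sketch of a helper that does this. The buildMongoUri function and its parameter names are hypothetical and not part of the starter project.

```javascript
// Hypothetical helper: builds an Atlas-style connection string.
// Username and password are percent-encoded so that reserved
// characters (for example "@" or "/") do not break the URI.
function buildMongoUri({ username, password, clusterHost, dbName }) {
  const user = encodeURIComponent(username);
  const pass = encodeURIComponent(password);
  return `mongodb+srv://${user}:${pass}@${clusterHost}/${dbName}?retryWrites=true&w=majority`;
}
```

For example, calling the helper with the password p@ss/word produces a URI in which the password appears as p%40ss%2Fword.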

Running the demo application

When the database is properly created and configured, open a terminal and run the demo application with:

npm run start

You will get this output:


> db-back-up-schedule@1.0.0 start
> node server.js

Server is running at port 3000
Connected successfully

Creating a company

Test the demo application by creating new company details. Open up Postman or your preferred API testing tool. Send a POST request to the http://localhost:3000/new-company endpoint using this JSON data:

{
  "name": "Facebook",
  "founder": "Mark"
}

Create new company

Viewing the list of companies

Next, send a GET request to http://localhost:3000/companies to retrieve that list of companies.

View companies
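If you prefer a script over Postman, the same two requests can be sketched with the fetch function that ships with Node.js 18 and later. The helper names below are hypothetical, and the sketch assumes the demo server is running locally on port 3000 and accepts a JSON body, as described above.

```javascript
const BASE_URL = "http://localhost:3000"; // assumption: local demo server

// Builds the fetch options for creating a company.
function buildCreateCompanyRequest(name, founder) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name, founder }),
  };
}

// Sends the POST request, then fetches the full list of companies.
async function createAndListCompanies(name, founder) {
  const created = await fetch(
    `${BASE_URL}/new-company`,
    buildCreateCompanyRequest(name, founder)
  );
  if (!created.ok) {
    throw new Error(`Create failed with status ${created.status}`);
  }
  const list = await fetch(`${BASE_URL}/companies`);
  return list.json();
}
```

With the server running, createAndListCompanies("Facebook", "Mark").then(console.log) should print the stored companies.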

Creating an application on Heroku

Next, create a new application on Heroku to host and run the Node.js project. Go to the Heroku dashboard to begin. Click New and then New App. Fill in the form with a name for your application and your region.

Note: Application names on Heroku are unique. Pick one that is available and make a note of it.

Create new Heroku App

Click the Create app button. You will be redirected to the Deploy view of your newly created application.

Next, create a configuration variable to reference the MongoDB URI that was extracted from the MongoDB Atlas dashboard earlier. To do that, navigate to the Settings page, scroll down, and click the Reveal Config Vars button.

Reveal Heroku Config

Specify the key and value as shown here, and click Add once you are done.

Add Mongo URI

Lastly, you need to retrieve the API key for your Heroku account. This key will be used to connect your CircleCI pipeline to Heroku. To get your API key, open the Account Settings page.

Scroll to the API keys section.

Get API Key

Click the Reveal button and copy the API key. Save it somewhere you can easily find it later.

Creating an Azure storage account

As mentioned, Microsoft Azure storage will be used to host the backed-up MongoDB collection for our database. To do this, you need to sign up for a free account on Azure. Then go to your Azure portal dashboard.

Create Azure storage

Click Storage accounts from the list of services or use the search feature by typing “storage” in the search bar.

Search storage

From the storage account page, click Create. On this new page, specify the details for your storage account.

Create storage details

Next, do this:

  1. Select a subscription.
  2. Select an existing resource group or create a new one.
  3. Enter a storage account name. For this tutorial, I named mine dbblobs.
  4. Select a region close to you.

Click Review + Create, then click the Create button. Your storage account will be created and deployed.

New storage account deployed

It is worth mentioning that an Azure blob storage account offers these resources:

  • The storage account that has just been created.
  • Container, which organizes a set of blobs, similar to a directory in a file system.
  • Blob, which is stored in a container, similar to a file stored in a directory.
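The three levels map directly onto the address of a blob. As a small illustration, the hypothetical blobUrl helper below composes that address. dbblobs is the account name used in this tutorial, and dbsnapshots and companies.bson are the container and blob names used later in the backup script.

```javascript
// Builds the URL of a blob from the three levels described above.
// Each path segment of the blob name is encoded separately so that
// "/" can still be used for virtual folders inside a container.
function blobUrl(accountName, containerName, blobName) {
  const encodedBlob = blobName.split("/").map(encodeURIComponent).join("/");
  return `https://${accountName}.blob.core.windows.net/${containerName}/${encodedBlob}`;
}
```

Calling blobUrl("dbblobs", "dbsnapshots", "companies.bson") returns https://dbblobs.blob.core.windows.net/dbsnapshots/companies.bson.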

At this point, you have a functioning storage account. The next thing to do is create a container to house your blobs (MongoDB collection backup, in our case). On your new storage account page, click Containers from the side menu bar. Then click + Container to create a new container.

Create new container

Give your container a name and change the public access level. Click the Create button once you are done.

Retrieving the access key

To easily establish a remote connection to either store or retrieve files from your storage account, you need an access key. By default, each storage account on Azure comes with two different access keys, which allows you to replace one while using the other. To reveal your keys, click Show keys and copy one of the keys, preferably the first one.

Retrieve access key

Paste the key somewhere safe on your computer; you will need it later.

Adding the pipeline configuration script

Next, you need to add the pipeline configuration for CircleCI. The pipeline will consist of steps to check out the code, install the Heroku CLI, and deploy the application to Heroku.

At the root of your project, create a folder named .circleci. In that folder, create a file named config.yml. In the newly created file, add this configuration:

version: 2.1
orbs:
  heroku: circleci/heroku@1.2.6
jobs:
  build:
    executor: heroku/default
    steps:
      - checkout
      - heroku/install
      - heroku/deploy-via-git:
          force: true
workflows:
  deploy:
    jobs:
      - build

This configuration pulls in the Heroku orb circleci/heroku, which automatically provides access to a robust set of Heroku jobs and commands. One of those commands is heroku/deploy-via-git, which deploys your application straight from your GitHub repo to your Heroku account.

Next, set up a repository on GitHub and link the project to CircleCI. Review Pushing a project to GitHub for step-by-step instructions.

Log in to your CircleCI account. If you signed up with your GitHub account, all your repositories will be available on your project’s dashboard.

Click Set Up Project for your db-back-up-schedule project.

Set up project

You will be prompted with a couple of options for the configuration file. Select the option to use the .circleci/config.yml in your repo. Enter the name of the branch where your code is housed on GitHub, then click the Set Up Project button.

Select Configuration

Your first workflow will start running, but it will fail. This is because you have not provided your Heroku API key. You can fix that now.

Click the Project Settings button, then click Environment Variables. Add these two new variables:

  • HEROKU_APP_NAME is the app name in Heroku (db-clean-up)
  • HEROKU_API_KEY is the Heroku API key that you retrieved from the account settings page

Select Rerun Workflow from Failed to rerun the Heroku deployment. This time, your workflow will run successfully.

To confirm that your workflow was successful, you can open your newly deployed app in your browser. The URL for your application should be in this format https://<HEROKU_APP_NAME>.herokuapp.com/.

View companies list on Heroku

Here is a quick recap of what you have done and learned so far. You have:

  • Created a working application locally
  • Created a functioning application on Heroku
  • Created a Microsoft Azure blob storage account
  • Successfully set up a pipeline to automate the deployment of your application to Heroku using CircleCI

Generating and uploading the backup file

MongoDB stores data records as documents, specifically BSON documents, which are gathered together in collections.

In this section, you will create a script to generate the database backup file (BSON document) for your project and also upload the file to Microsoft Azure. To do this, we will use two different tools:

  • [mongodump](https://docs.mongodb.com/database-tools/mongodump/) is a command-line utility that creates a binary export of the contents of a database. It is part of the MongoDB Database Tools package and will be installed by the pipeline when it runs on CircleCI.
  • [Azure Storage Blob](https://www.npmjs.com/package/@azure/storage-blob) is a JavaScript library that makes it easy to consume the Microsoft Azure Storage blob service from a Node.js application. This library has already been included and installed as a dependency for our project in this tutorial.

To generate and upload the backup file, create a new file named backup.js at the root of the application and use this content for it:

require("dotenv").config();
const exec = require("child_process").exec;
const path = require("path");
const {
  BlobServiceClient,
  StorageSharedKeyCredential,
} = require("@azure/storage-blob");

const backupDirPath = path.join(__dirname, "database-backup");

const storeFileOnAzure = async (file) => {
  const account = process.env.ACCOUNT_NAME;
  const accountKey = process.env.ACCOUNT_KEY;
  const containerName = "dbsnapshots";

  const sharedKeyCredential = new StorageSharedKeyCredential(
    account,
    accountKey
  );

  // instantiate Client
  const blobServiceClient = new BlobServiceClient(
    `https://${account}.blob.core.windows.net`,
    sharedKeyCredential
  );

  const container = blobServiceClient.getContainerClient(containerName);
  const blobName = "companies.bson";
  const blockBlobClient = container.getBlockBlobClient(blobName);
  const uploadBlobResponse = await blockBlobClient.uploadFile(file);
  console.log(
    `Upload block blob ${blobName} successfully`,
    uploadBlobResponse.requestId
  );
};

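// Quote the values so that the shell does not interpret characters such as "&" in the URI.
const cmd = `mongodump --forceTableScan --out="${backupDirPath}" --uri="${process.env.MONGODB_URI}"`;

const dbAutoBackUp = () => {
  // mongodump writes to <out>/<database name>/<collection name>.bson
  const filePath = path.join(backupDirPath, "db-back-up-schedule", "companies.bson");
  exec(cmd, (error, stdout, stderr) => {
    if (error) {
      // Log stderr only: the command string contains the connection string.
      console.error("mongodump failed:", stderr);
      process.exit(1);
    }
    storeFileOnAzure(filePath).catch((err) => {
      console.error("Upload to Azure failed:", err.message);
      process.exit(1);
    });
  });
};

dbAutoBackUp();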

This file imports the required dependencies, including the Azure storage client SDK, and specifies the path where the backup file will be housed. It then defines:

  • cmd, the mongodump command that is executed to generate the backup file. The --out flag specifies the path to the folder where the file will be housed, while --uri specifies the MongoDB connection string.
  • The *storeFileOnAzure()* function, which takes the absolute path of the backup file and uploads it to the Azure storage container using the Azure Storage Blob client library.
  • The *dbAutoBackUp()* function, which uses the exec() function from the Node.js child_process module to spawn a shell and execute the mongodump command. It then passes filePath, the exact location of the generated BSON file (companies.bson in this case), to storeFileOnAzure().
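Because blobName is fixed in storeFileOnAzure(), every run uploads to the same blob, so as far as I can tell only the most recent backup is kept. If you want to keep a history, one possible variation (not part of the original script) is to include a timestamp in the blob name. timestampedBlobName is a hypothetical helper:

```javascript
// Hypothetical helper: returns a blob name such as
// "companies-2022-03-07T07-07-25-408Z.bson".
// Colons and dots in the ISO timestamp are replaced with dashes
// to keep the name easy to handle in URLs and file systems.
function timestampedBlobName(collection, date = new Date()) {
  const stamp = date.toISOString().replace(/[:.]/g, "-");
  return `${collection}-${stamp}.bson`;
}
```

In backup.js, you could then set blobName to timestampedBlobName("companies") instead of the fixed string. Keep in mind that with a frequent schedule this creates many blobs, so you may want to remove old ones periodically.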

Note: The folder and file names in the file path must match the database name and collection name for the application as shown on MongoDB Atlas. So, if your database name is userdb and your collection name is users, then your file path would point to the userdb/users.bson file.
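The path rule in the note above can be sketched as a small function. backupFilePath is a hypothetical helper, not part of the starter project, and the names in the usage note are the example names from the note.

```javascript
const path = require("path");

// mongodump --out=<dir> writes one folder per database and one
// .bson file per collection: <dir>/<database>/<collection>.bson
function backupFilePath(outDir, dbName, collectionName) {
  return path.join(outDir, dbName, `${collectionName}.bson`);
}
```

For a database named userdb with a users collection, backupFilePath(backupDirPath, "userdb", "users") points at the users.bson file inside the userdb folder.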

Collections on MongoDB Atlas

Creating and implementing a scheduled pipeline

There are two different options for setting up scheduled pipelines from scratch:

  • Using the API
  • Using project settings

In this tutorial, we will use the API, so you will need:

  1. CircleCI API token
  2. Name of the version control system where your repository is hosted
  3. Your organization name
  4. Current project ID on CircleCI

To get the token, go to your CircleCI dashboard and click your avatar:

CircleCI API Key

You will be redirected to your User Settings page. From there, navigate to Personal API Tokens, create a new token, give your token a name and save it somewhere safe.

Now, open the .env file from the root of your project and add:

VCS_TYPE=VERSION_CONTROL_SYSTEM
ORG_NAME=ORGANISATION_NAME
PROJECT_ID=PROJECT_ID
CIRCLECI_TOKEN=YOUR_CIRCLECI_TOKEN
MONGODB_URI=YOUR_MONGODB_URL

Replace the placeholders with your values:

  • VCS_TYPE is your version control system, such as github.
  • ORG_NAME is your GitHub username or organization name.
  • PROJECT_ID is your project name on CircleCI. It is db-back-up-schedule for the sample project.
  • CIRCLECI_TOKEN is your CircleCI token.
  • MONGODB_URI is your MongoDB URI string as extracted from MongoDB Atlas dashboard.
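A missing value in .env usually only shows up later as a failed API call. As an optional safeguard, here is a minimal sketch that checks the variables up front. findMissingVars and REQUIRED_VARS are hypothetical names, not part of the starter project.

```javascript
// Names of the variables that the scheduling script reads from .env.
const REQUIRED_VARS = ["VCS_TYPE", "ORG_NAME", "PROJECT_ID", "CIRCLECI_TOKEN"];

// Returns the names that are unset or empty in the given environment.
function findMissingVars(env = process.env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}
```

You could call findMissingVars() right after loading dotenv and stop with a clear message if the returned array is not empty.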

The next thing to do is create a new file named schedule.js within the root of your project and use this content for it:

const axios = require("axios").default;
require("dotenv").config();
const API_BASE_URL = "https://circleci.com/api/v2/project";

const vcs = process.env.VCS_TYPE;
const org = process.env.ORG_NAME;
const project = process.env.PROJECT_ID;
const token = process.env.CIRCLECI_TOKEN;
const postScheduleEndpoint = `${API_BASE_URL}/${vcs}/${org}/${project}/schedule`;

async function scheduleDatabaseBackup() {
  try {
    let res = await axios.post(
      postScheduleEndpoint,
      {
        name: "Database backup",
        description: "Schedule database backup for your app in production",
        "attribution-actor": "current",
        parameters: {
          branch: "main",
          "run-schedule": true,
        },
        timetable: {
          "per-hour": 30,
          "hours-of-day": [
            0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
            19, 20, 21, 22, 23,
          ],
          "days-of-week": ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"],
        },
      },
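      {
        headers: { "circle-token": token },
      }
    );
    console.log(res.data);
  } catch (error) {
    // error.response is undefined when the request never reached the API
    console.error(error.response ? error.response.data : error.message);
  }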
}
scheduleDatabaseBackup();

This code creates a function named **scheduleDatabaseBackup()** that posts the pipeline schedule details to the CircleCI API.

The payload includes:

  • name, which is the schedule name. It needs to be unique.
  • description is an optional field and is used to describe the schedule.
  • attribution-actor can be either system for a neutral actor or current, which takes your current user’s permissions (as per the token you use).
  • The parameters object specifies which branch to trigger. It includes an additional value for checking when to run the pipeline.
  • timetable defines when and how frequently to run the scheduled pipelines. The fields to use here are per-hour, hours-of-day, and days-of-week.

Note that timetable does not take a cron expression, making it more easily parsable by humans reasoning with the API. For this tutorial, the schedule is set to run 30 times within an hour, which is about every 2 minutes.
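To make the arithmetic concrete, here is a small sketch that derives the run frequency from a timetable object. describeTimetable is a hypothetical helper, and it assumes the runs are spread evenly across each hour.

```javascript
// Derives simple statistics from a timetable object shaped like the
// one sent to the CircleCI API above.
function describeTimetable(timetable) {
  const perHour = timetable["per-hour"];
  const hours = timetable["hours-of-day"].length;
  const days = timetable["days-of-week"].length;
  return {
    minutesBetweenRuns: 60 / perHour, // assumes evenly spaced runs
    runsPerDay: perHour * hours,
    runsPerWeek: perHour * hours * days,
  };
}
```

For the timetable in this tutorial, that works out to a run every 2 minutes, or 720 runs per day. That frequency is convenient for a demonstration. For a real backup, a timetable with "per-hour": 1 and "hours-of-day": [0] would run once per day instead.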

The code also passes the CircleCI token to the header.

Updating configuration file

Before running the scheduled pipeline, we need to update the CircleCI pipeline configuration script. Open the .circleci/config.yml file and replace its content with this:

version: 2.1
orbs:
  heroku: circleci/heroku@1.2.6
jobs:
  build:
    executor: heroku/default
    steps:
      - checkout
      - heroku/install
      - heroku/deploy-via-git:
          force: true
  schedule_backup:
    working_directory: ~/project
    docker:
      - image: cimg/node:17.4.0
    steps:
      - checkout
      - run:
          name: Install MongoDB Tools.
          command: |
            npm install
            sudo apt-get update
            sudo apt-get install -y mongodb
      - run:
          name: Run database back up
          command: npm run backup
parameters:
  run-schedule:
    type: boolean
    default: false
workflows:
  deploy:
    when:
      not: << pipeline.parameters.run-schedule >>
    jobs:
      - build
  backup:
    when: << pipeline.parameters.run-schedule >>
    jobs:
      - schedule_backup

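The config now includes a new job named schedule_backup. It runs in a Node.js Docker image, installs the project dependencies and the MongoDB tools, then runs npm run backup. This step assumes that package.json defines a backup script that executes backup.js. The config also declares a run-schedule pipeline parameter, which defaults to false and is used to decide which workflow runs.

Each workflow has a when expression. The backup workflow runs only when run-schedule is true, which is the value the scheduled pipeline passes in. The deploy workflow runs only when run-schedule is false, so regular pushes still deploy the application.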
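If you want to check the backup workflow without waiting for a schedule, one option is to trigger a pipeline manually and pass the run-schedule parameter yourself. The sketch below only builds the request for the CircleCI API v2 pipeline trigger endpoint. buildTriggerRequest is a hypothetical helper, and I am assuming the same project slug format and circle-token header used elsewhere in this tutorial.

```javascript
// Builds the URL and fetch options for CircleCI's "trigger a new pipeline"
// endpoint. Nothing is sent until the result is passed to fetch().
function buildTriggerRequest({ vcs, org, project, token, branch = "main" }) {
  return {
    url: `https://circleci.com/api/v2/project/${vcs}/${org}/${project}/pipeline`,
    options: {
      method: "POST",
      headers: { "circle-token": token, "Content-Type": "application/json" },
      // Setting run-schedule to true selects the backup workflow.
      body: JSON.stringify({ branch, parameters: { "run-schedule": true } }),
    },
  };
}
```

With Node.js 18 or later, you could destructure { url, options } from the result and call fetch(url, options) to start the pipeline.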

Creating more environment variables on CircleCI

Just before you add and push all updates to GitHub, add the MongoDB connection string, Azure account name, and key as environment variables on your CircleCI project.

From the current project pipelines page, click the Project Settings button. Next, select Environment Variables from the side menu. Add these variables:

  • ACCOUNT_KEY is your Microsoft Azure storage account key.
  • ACCOUNT_NAME is the Microsoft Azure storage account name (dbblobs for this tutorial).
  • MONGODB_URI is your MongoDB connection string.

List of environment variables

Now, commit your changes and push your code to GitHub.

Running the scheduled pipeline

The schedule configuration file is updated and ready to go. To create the scheduled pipeline, run this from the root of your project:

node schedule.js

The output should be similar to this:

{
  "description": "Schedule database backup for your app in production",
  "updated-at": "2022-03-07T07:07:25.408Z",
  "name": "Database backup",
  "id": "caa627c8-2768-4ac7-8150-e808fb566cc6",
  "project-slug": "gh/CIRCLECI-GWP/db-back-up-schedule",
  "created-at": "2022-03-07T07:07:25.408Z",
  "parameters": { "branch": "main", "run-schedule": true },
  "actor": {
    "login": "daumie",
    "name": "Dominic Motuka",
    "id": "335b50ce-fd34-4a74-bc0b-b6455aa90325"
  },
  "timetable": {
    "per-hour": 30,
    "hours-of-day": [
      0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
      21, 22, 23
    ],
    "days-of-week": ["MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN"]
  }
}

Review your scheduled pipelines in action

Return to the pipeline page on CircleCI. Your pipeline will be triggered every two minutes.

Scheduled pipelines in action

This is a good time to open the container within your Azure storage account to confirm that the file has been uploaded successfully.

View blobs on Azure

Bonus section: retrieving a schedule list and deleting a schedule

In this last section, you will learn:

  • How to retrieve all the schedules for a particular project
  • How to delete any schedule

Retrieve the list of schedules for a project

To fetch all schedules, create a new file named get.js within the root of the project. Enter this content:

const axios = require("axios").default;
require("dotenv").config();

const API_BASE_URL = "https://circleci.com/api/v2/project";
const vcs = process.env.VCS_TYPE;
const org = process.env.ORG_NAME;
const project = process.env.PROJECT_ID;
const token = process.env.CIRCLECI_TOKEN;

const getSchedulesEndpoint = `${API_BASE_URL}/${vcs}/${org}/${project}/schedule/`;

async function getSchedules() {
  let res = await axios.get(getSchedulesEndpoint, {
    headers: {
      "circle-token": `${token}`,
    },
  });

  console.log(res.data.items[0]);
}

getSchedules();

This snippet fetches and logs the schedules in your terminal, but just the first item within the schedules array. To see all items, replace res.data.items[0] with res.data.items.

Now run the file with node get.js. Your output should be similar to this:

{
  description: 'Schedule database backup for your app in production',
  'updated-at': '2022-03-07T10:49:58.123Z',
  name: 'Database backup',
  id: '6aa72c63-b4c4-4dc0-b099-b8661a7a2052',
  'project-slug': 'gh/yemiwebby/db-back-up-schedule',
  'created-at': '2022-03-07T10:49:58.123Z',
  parameters: { branch: 'main', 'run-schedule': true },
  actor: {
    login: 'yemiwebby',
    name: 'Oluyemi',
    id: '7b490556-c1bb-4b42-a201-c1785a00005b'
  },
  timetable: {
    'per-hour': 30,
    'hours-of-day': [
       0,  1,  2,  3,  4,  5,  6,  7,
       8,  9, 10, 11, 12, 13, 14, 15,
      16, 17, 18, 19, 20, 21, 22, 23
    ],
    'days-of-week': [
      'MON', 'TUE',
      'WED', 'THU',
      'FRI', 'SAT',
      'SUN'
    ]
  }
}
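When a project has several schedules, the full objects can be hard to scan. As a small sketch, the hypothetical summarizeSchedules helper below reduces each item to the fields shown in the output above.

```javascript
// Reduces each schedule returned by the API to a one-line summary.
function summarizeSchedules(items) {
  return items.map((item) => ({
    id: item.id,
    name: item.name,
    perHour: item.timetable["per-hour"],
    branch: item.parameters.branch,
  }));
}
```

In get.js, console.table(summarizeSchedules(res.data.items)) would print one row per schedule.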

Delete any schedule

Deleting a schedule requires its unique ID. We can use the ID of the schedule from the previous section for this demonstration.

Create another file named delete.js and paste this code in it:

const axios = require("axios").default;
require("dotenv").config();

const API_BASE_URL = "https://circleci.com/api/v2/schedule";
const vcs = process.env.VCS_TYPE;
const org = process.env.ORG_NAME;
const project = process.env.PROJECT_ID;
const token = process.env.CIRCLECI_TOKEN;

const schedule_ids = ["YOUR_SCHEDULE_ID"];

async function deleteScheduleById() {
  for (let i = 0; i < schedule_ids.length; i++) {
    let deleteScheduleEndpoint = `${API_BASE_URL}/${schedule_ids[i]}`;
    let res = await axios.delete(deleteScheduleEndpoint, {
      headers: { "circle-token": token },
    });
    console.log(res.data);
  }
}

deleteScheduleById();

Replace the YOUR_SCHEDULE_ID placeholder with the ID extracted from the previous section and save the file. Next, run node delete.js from the terminal. The output:

{ message: 'Schedule deleted.' }
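Copying IDs by hand gets tedious when several schedules share a name, for example after running the scheduling script more than once. As a sketch, the hypothetical deleteEndpointsFor helper below takes the items returned by the list request and builds the delete URL for every schedule with a given name.

```javascript
const SCHEDULE_API_URL = "https://circleci.com/api/v2/schedule";

// Returns the delete endpoint for every schedule whose name matches.
function deleteEndpointsFor(items, scheduleName) {
  return items
    .filter((item) => item.name === scheduleName)
    .map((item) => `${SCHEDULE_API_URL}/${item.id}`);
}
```

Each returned URL can be passed to axios.delete with the same circle-token header used in delete.js.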

Conclusion

In this tutorial, you downloaded a sample project from GitHub and ran it locally on your machine before deploying it to the Heroku platform via CircleCI. You then created some records in your MongoDB database and created a script to generate a backup collection of the database using MongoDB tools. You stored the backup file on Microsoft Azure and used the scheduled pipeline feature from CircleCI to automate the file backup process at a reasonable interval.

This tutorial covers an important use case for scheduled pipelines because it automates a task that would otherwise have been done manually. Tasks like database backups are too important to be left to manual effort. They take up valuable developer time, and in busy or stressful periods they are easy to forget. Scheduling pipelines for database backups solves these problems so you and your team have more time to develop and release applications.

I hope that you found this tutorial helpful. The complete source code can be found here on GitHub.


Oluyemi is a tech enthusiast with a background in Telecommunication Engineering. With a keen interest in solving day-to-day problems encountered by users, he ventured into programming and has since directed his problem solving skills at building software for both web and mobile. A full stack software engineer with a passion for sharing knowledge, Oluyemi has published a good number of technical articles and blog posts on several blogs around the world. Being tech savvy, his hobbies include trying out new programming languages and frameworks.
