At CircleCI, we support the open source community: we offer open source software (OSS) projects free resources and unlimited usage. Because of that, we process nearly 1.2 million OSS builds per month. Recently, we reached out to some of the most interesting OSS projects building on CircleCI to ask them why being open source matters to their project, why CI matters to it, and which tips from their configs other OSS projects can benefit from. For the first post in our series covering these projects, we learned more about Alex’s Lemonade Stand Foundation from Rich Jones, an engineer on the project.

Tell me about the Alex’s Lemonade Stand Foundation. What is your OSS project for?

Alex’s Lemonade Stand Foundation (ALSF) changes the lives of children with cancer by funding impactful research, raising awareness, supporting families and empowering everyone to help cure childhood cancer. Some of the research that ALSF funds goes to the Childhood Cancer Data Lab (CCDL), where I work. At the CCDL, we use big data (like, really really big!) and machine learning/AI to look for new cures, better cures and cheaper cures for childhood cancers.

One of our current projects is refine.bio, where we pre-process all of the world’s public RNA data into searchable and usable datasets for disease researchers and computational biologists. Our ultimate goal is to be like Google for your transcriptome.

What do you do for the project?

I’m an engineer on the project along with one other backend engineer, a front-end engineer, two scientists, and a designer. I’m responsible for programming the ETL pipelines and the API, and architecting/deploying/maintaining our massive cloud compute infrastructure. I’m here because I’m a big softy and I got tricked into working for a good cause. But seriously, it’s awesome to get to work on challenging and interesting technical problems in a new mission-driven organization. I also get to learn quite a bit about biological data science, which is all new to me.

Why is it important for this project to be open source?

All of the work we do is free and open source software. That’s important to me as a developer because I think software freedom is important and I like working in a globally collaborative environment. It’s also just as important for the reproducibility of our scientific results and for establishing trust with the scientists who use the data we produce.

We imagine a future where current software engineering best practices, like continuous integration and reproducible builds, become standard in scientific research pipelines so that scientific teams can quickly, easily, and accurately reproduce and improve upon the results of other teams. Currently, that process is actually pretty hard as some groups are hesitant to share their code or data, which diminishes the utility of their results. Our one and only goal is to cure cancer in children, so we want to produce all the utility we can.

In what ways does CI improve the experience of your outside contributors?

Our use of CI helps us quickly identify any problems that might be introduced by changes from internal and external contributors, or by any of the many upstream dependencies we have across the many ecosystems and third-party services that we rely on. It’s usually at least once a week that we catch something of medium to high severity that we wouldn’t have noticed otherwise. These issues would have caused broken deploys, which can mean a significant amount of money at the scale that we operate at. Since we’re a non-profit, that matters a lot as we want every donation that people make to go as far as it can.

Can you share an interesting code snippet from your config and describe what you are doing in it?

Our CircleCI config is pretty long and crazy, but here’s a fun part. We use Slack a lot for team communications and notifications about our services, so we have a script that can alert all the members of a Slack channel with a custom message whenever something important happens. Here we have Gritty (we’re based in Philadelphia) announce to our #robots channel whenever a new staging or production deploy completes successfully:

  deploy:
    machine: true
    working_directory: ~/refinebio
    steps:
      - checkout
      - run: ./.circleci/install_git_decrypt.sh # We use `AGWA/git-crypt` for decrypting public secrets
      - run: ./.circleci/git_decrypt.sh
      - run:
          command: ./.circleci/remote_deploy.sh
          no_output_timeout: 4h # Yes, deploying can take a long time!
      - run: ./.circleci/slackpost.sh robots deploybot

Where slackpost.sh is:

#!/bin/bash
# Usage: slackpost.sh "<channel>" "<username>"
# (the message text is built below from CircleCI environment variables)
# Originally by @dopiaza, modified for CircleCI by @AlexsLemonade

webhook_url="<OUR_SLACK_WEBHOOK_URL>"
channel=$1
username=$2

# Check if we're on the master or dev branch
master_check=$(git branch --contains "tags/$CIRCLE_TAG" | grep '^  master$' || true)
dev_check=$(git branch --contains "tags/$CIRCLE_TAG" | grep '^  dev$' || true)

# Map the tag back to a branch name for the message below
if [[ -n $master_check ]]; then
    CIRCLE_BRANCH=master
elif [[ -n $dev_check ]]; then
    CIRCLE_BRANCH=dev
fi

text="New deployment! Woo! $CIRCLE_PROJECT_USERNAME: $CIRCLE_PULL_REQUEST $CIRCLE_BRANCH $CIRCLE_TAG"
# Escape backslashes and double quotes so the text is safe inside the JSON string
escapedText=$(printf '%s' "$text" | sed 's/\\/\\\\/g; s/"/\\"/g')
json="{\"channel\": \"$channel\", \"username\":\"$username\", \"icon_emoji\":\":gritty:\", \"attachments\":[{\"color\":\"danger\" , \"text\": \"$escapedText\"}]}"
curl -s --data-urlencode "payload=$json" "$webhook_url"
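
For readers adapting this script, the escaping and payload-building steps can be pulled out into small functions that are easy to check locally without posting to Slack. This is only a sketch under stated assumptions: `json_escape` and `build_payload` are hypothetical helper names that do not appear in the refine.bio repository, and the escaping handles only backslashes and double quotes, not newlines or other control characters.

```shell
#!/bin/bash
# Hypothetical helpers, not part of the original slackpost.sh.

# Escape a string so it can sit inside a double-quoted JSON value.
# Backslashes are escaped first, then double quotes.
# Assumption: the input has no newlines or other control characters.
json_escape() {
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Assemble the same payload shape that the script above sends:
# channel, username, the :gritty: emoji, and one attachment with the text.
build_payload() {
    local channel=$1 username=$2 text
    text=$(json_escape "$3")
    printf '{"channel": "%s", "username": "%s", "icon_emoji": ":gritty:", "attachments": [{"color": "danger", "text": "%s"}]}' \
        "$channel" "$username" "$text"
}
```

After sourcing these functions, running `build_payload robots deploybot 'New "staging" deploy'` prints the JSON to the terminal, so you can inspect it before handing it to `curl`.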

Anything else you would like our readers to know about your project?

We’re available on GitHub at /AlexsLemonade/refinebio/ and we like it when we get stars!

If you are interested in joining our team, take a look through our current job openings. If you want to support our work, you can make tax-deductible donations to Alex’s Lemonade Stand Foundation here. Tell them that the Childhood Cancer Data Lab and CircleCI sent you!
