Amazon’s Auto Scaling groups (ASGs) are, in theory, a great way to scale. The idea is that you give a group a desired capacity and the knowledge of how to launch more machines, and it fully automates spinning your fleet up and down as the desired capacity changes. Unfortunately, in practice, there are a couple of key reasons we can’t use them to manage our CircleCI.com fleet. One of the most important is that the default ASG termination policy kills instances too quickly. Because our instances are running builds for our customers, we can’t simply kill them instantly; we must wait for all builds on an instance to finish before we can terminate it.
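To make the "wait for builds to finish" requirement concrete, here is a minimal sketch of a drain-then-terminate loop. It is an illustration under stated assumptions, not our actual implementation: the three callbacks (`stop_accepting_builds`, `running_build_count`, `terminate`) are hypothetical hooks you would wire up to your own scheduler and cloud API, and the 30-second poll interval is an arbitrary choice.

```python
import time
from typing import Callable


def drain_then_terminate(
    instance_id: str,
    stop_accepting_builds: Callable[[str], None],  # hypothetical hook
    running_build_count: Callable[[str], int],     # hypothetical hook
    terminate: Callable[[str], None],              # hypothetical hook
    poll_seconds: float = 30.0,                    # assumed interval
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Terminate an instance only after its in-flight builds finish.

    Returns the number of times we had to wait before terminating.
    """
    # Take the instance out of rotation so no new builds land on it.
    stop_accepting_builds(instance_id)

    polls = 0
    # Keep waiting while any build is still running on the instance.
    while running_build_count(instance_id) > 0:
        polls += 1
        sleep(poll_seconds)

    # Only now is it safe to kill the machine.
    terminate(instance_id)
    return polls
```

The key design point is the ordering: the instance stops receiving new work first, so the running-build count can only go down, and termination happens strictly after it reaches zero. A production version would likely also want a maximum wait so that a stuck build cannot keep an instance alive forever.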