Last week I wrote It’s the Future, a piece that satirized the container ecosystem, lightly mocking Docker and Google and CoreOS and a bunch of other technologies. Lots of Docker enthusiasts enjoyed being the butt of the joke, but it was also much loved and shared by lots of people yelling “I told you this was all bullshit”.
It’s very easy to see why people might think the container ecosystem is bullshit, in exactly the way I satirized. After all, it’s not exactly clear at first glance what Docker is. It’s containerization, which is like virtualization, but not quite. It’s got a Dockerfile which is kinda like Chef, but it’s combined with something called layers which involves a weird filesystem or something. It solves similar problems to AWS and Heroku and VMware and Vagrant, but in each case it’s slightly different in a way that’s not particularly clear at first and also it’s really not clear why at all. It’s got 27 competing versions of tools that do you-can’t-tell-exactly-what, with funny names like machine and swarm and flannel and weave and etcd and rkt and kubernetes and compose and flocker. It’s somehow linked to microservices which are new and shiny but seem like a fantastically stupid idea considering how hard it is to keep a single service running in the first place. And after all that, it’s got this weird culty vibe to it, with dozens of startups and big corps all competing to get “developer mindshare” which might somehow someday relate to money and it’s all very 1999 and there’s definitely some kind of Koolaid being drunk.
It’s really not unreasonable to look at the whole Docker and container thing and conclude that it’s all bullshit.
Except it’s not.
It’s actually the future of how we build applications.
Why the Haterade?
Many of the folks who reacted to It’s the Future were those who felt that it was 100% accurate, not satirical at all, and who questioned the hype around this whole container thing. Why?
The Docker and container ecosystem (hereafter “Docker”) is taking a bunch of staples of the application developer world, such as virtualization, service-oriented architectures, and operating systems, and redelivering them with different goals and benefits. As it does so, it raises the hackles of a large portion of the developer community: curmudgeons who hate anything new.
The software industry, contrary to what you might expect, is absolutely filled with people who hate progress. The sort of people who would walk into the Sistine Chapel after Michelangelo was done and declare that they already had a perfectly good picture of god, they prefer their ceilings to be white, and that frescos aren’t that cool anyway.
At the same time, most of the software industry makes its decisions like a high school teenager: people obsessively check what’s cool in their clique, maybe look around at what’s on Instagram and Facebook, and then follow blindly where they are led. Around these technologies they form cliques of conformity, even going so far as to carve their own identities around the technological niche they fit into (they even cover their laptops with their gang colors), and they hate and complain about things that are strange or different.
Into that world drops Docker: a new way of doing almost everything. It throws away old rules about operating systems, and deployment, and ops, and packaging, and firewalls, and PaaSes, and everything else. Some developers instantly love it, sometimes for valid reasons such as the problems it solves, and sometimes because it’s a shiny toy that allows them to be cool before the other kids get to it. Other developers hate it (it’s pure hype, they say; it’s just the same as what came before and I don’t see why everyone is talking about it, they say), often for reasons that are more tribal than rational.
So reactions to Docker are not necessarily based on the technology itself. Most haters are not really reacting to Docker’s solutions to important and complex problems. Mostly, this is because those problems are ones you might not have noticed if you haven’t spent time scaling big systems. If you don’t intuitively and deeply understand what’s meant by “cattle not pets” and why that’s important, then a lot of the choices made by Docker and related tools are going to seem weird and scary to you.
Meanwhile, over the last two decades, the distributed systems people have been doing some rather boring shit. They’ve experimented with complex protocols like CORBA and SOAP, and learned how to deal with issues that appear largely theoretical to most: the CAP theorem, the impossibility of perfect clock synchronization, and the Two Generals Problem. And those problems and their solutions have been rather uninteresting to anyone who is simply trying to take their knowledge of how to code and use it to ship web applications.
But then something interesting happened. Web applications got large enough that they started to need to scale. Enough people arrived on the internet that web apps could no longer sit on a single VPS, or just scale up vertically. And as we started to scale, we started seeing all these bugs in our applications, bugs with interesting names like “race conditions” and “network partitions” and “deadlock” and “Byzantine failures”. These were problems the distributed systems folks had been working on for quite some time, problems whose solutions were not only difficult, but in many cases theoretically impossible.
In the early years of this scalability crisis, Heroku happened. And Heroku made it really easy to scale infrastructure horizontally, allowing us to pretend once again that we were really just making simple web apps. And we bought ourselves, as an industry, maybe 5 years of pretending and self-delusion.
We’ve now hit the limit of that self-delusion, and as we come out of it, we find ourselves trying to build scalability early, and re-architecting broken things so that they can scale, and learning about the downsides of monolithic architectures and why using a single database won’t just keep working for us. And we come up with phrases like Immutable Architecture, “Pets vs Cattle”, Microservices, and a whole set of best and worst practices to try and make some of this easier.
At this point, during this shift, Docker comes in and tries to solve a lot of problems. But instead of telling us we can pretend the problem of scaling doesn’t exist and we can keep doing things in basically the same way, like Heroku did, Docker tells us that distributed systems are fundamentally what we’ve been doing all along, and so we need to accept it and start to work within this model. Instead of dealing with simple things like web frameworks, databases, and operating systems, we are now presented with tools like Swarm and Weave and Kubernetes and etcd, tools that don’t pretend that everything is simple, and that actually require us to step up our game to not only solve problems, but to understand deeply the problems that we are solving.
The upside is that we gain the ability to build scalable architecture so long as we don’t pretend we can abstract it away. We now need to know what a network partition is and how to deal with it, and how to choose between an AP and a CP system, and how to build architectures that can actually scale under the duresses of real networks and machines. Sometimes there’s an electrical storm in Virginia, and sometimes things get set on fire, and sometimes a shark bites an undersea cable, and sometimes there is latency, and delivery failures, and machines die, and abstractions leak.
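If the AP versus CP choice sounds abstract, here is a toy sketch of it in Python. Everything in it is made up for illustration (the `ToyStore` class, the `Unavailable` exception, the two-replica setup), and real systems are far more involved; it only shows what each choice gives up once the network between two replicas is cut.

```python
class Unavailable(Exception):
    """Raised when a CP store refuses a request during a partition."""


class ToyStore:
    """Two replicas of one key-value map (hypothetical, for illustration).

    Setting `partitioned` to True cuts the link between the replicas.
    """

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False

    def write(self, replica, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for r in self.replicas:
                r[key] = value
            return
        if self.mode == "CP":
            # Consistency first: refuse rather than let replicas diverge.
            raise Unavailable("cannot reach the other replica")
        # Availability first: accept locally, reconcile after the partition.
        self.replicas[replica][key] = value

    def read(self, replica, key):
        return self.replicas[replica].get(key)


ap = ToyStore("AP")
ap.write(0, "color", "blue")   # healthy network: both replicas get it
ap.partitioned = True
ap.write(0, "color", "red")    # accepted locally; replica 1 is now stale

cp = ToyStore("CP")
cp.write(0, "color", "blue")
cp.partitioned = True
try:
    cp.write(0, "color", "red")
    cp_refused = False
except Unavailable:
    cp_refused = True          # the write is rejected; both replicas still agree
```

The AP store keeps answering but its replicas now disagree about `color`; the CP store stays consistent but turns the write away. Neither is free, which is rather the point.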
Everything needs to be more resilient, more reliable, and we need to acknowledge that those are things we need to think about as part of developing applications. And we need to do it not because it’s shiny, or because it’s some mythical best practice, but because people like Amazon and Netflix and Google have put 15 years of sweat and blood and industry experience into working this shit out and telling us how to build systems at real scale.
Real problems solved
So what exactly is Docker solving for us, then? Everything that we’re doing as we build web applications is extremely fragile, and Docker is forcing some sanity onto it:
Up until now we’ve been deploying machines (the ops part of DevOps) separately from applications (the dev part). And we’ve even had two different teams administering these parts of the application stack. Which is ludicrous because the application relies on the machine and the OS as well as the code, and thinking of them separately makes no sense. Containers unify the OS and the app within the developer’s toolkit.
Up until now, we’ve been running our service-oriented architectures on AWS and Heroku and other IaaSes and PaaSes that lack any real tools for managing service-oriented architectures. Kubernetes and Swarm manage and orchestrate these services.
Up until now, we have used entire operating systems to deploy our applications, with all of the security footprint that they entail, rather than the absolute minimal thing which we could deploy. Containers allow you to expose a very minimal application, with only the ports you need, which can even be as small as a single static binary.
Up until now, we have been fiddling with machines after they went live, either using “configuration management” tools or by redeploying an application to the same machine multiple times. Since containers are scaled up and down by orchestration frameworks, only immutable images are started, and running machines are never reused, removing potential points of failure.
Up until now, we have been using languages and frameworks that are largely designed for single applications on a single machine. The equivalent of Rails’ routes for service-oriented architectures hasn’t really existed before. Now Kubernetes and Compose allow you to specify topologies that cross services.
Up until now, we’ve been deploying heavy-weight virtualized servers in sizes that AWS provides. We couldn’t say “I want 0.1 of a CPU and 200MB of RAM”. We’ve been paying virtualization overhead and using more resources than our applications need. Containers can be deployed with much smaller requirements, and do a better job of sharing.
Up until now, we’ve been deploying applications and services using multi-user operating systems. Unix was built to have dozens of users running on it simultaneously, sharing binaries and databases and filesystems and services. This is a complete mismatch for what we do when we build web services. Again, containers can hold just simple binaries instead of entire OSes, which results in a lot less to think about in your application or service.
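To put rough numbers on the “0.1 of a CPU and 200MB of RAM” point above, here is a back-of-the-envelope sketch in Python. The host size (2 CPUs, 4096 MB of RAM) is an assumption picked purely for illustration, the function name is made up, and the arithmetic ignores whatever the host OS and the container runtime need for themselves.

```python
def instances_per_host(host_millicpu, host_ram_mb, millicpu_each, ram_mb_each):
    """How many equally sized containers fit on one host.

    CPU is counted in thousandths of a core so the arithmetic stays in
    integers: 0.1 of a CPU is 100 millicpu.
    """
    by_cpu = host_millicpu // millicpu_each
    by_ram = host_ram_mb // ram_mb_each
    # The scarcer resource decides how many containers actually fit.
    return min(by_cpu, by_ram)


# Hypothetical host: 2 CPUs and 4096 MB of RAM (assumed, not from any provider).
fits = instances_per_host(2000, 4096, millicpu_each=100, ram_mb_each=200)
print(fits)  # prints 20
```

Under those assumptions, one machine that would otherwise run a single app hosts twenty small services, which is the kind of sharing a fixed menu of VM sizes makes awkward.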
The only constant is change
Our industry moves so quickly to deify new and exciting technologies that it doesn’t wait for those technologies to mature. Docker is moving at an incredible pace, meaning that it hasn’t come close to stabilizing or maturing. We have multiple options for container run-times and image formats and orchestration tools and host OSes, each with a different level of utility, scope, traction and community support.
Looking around the rest of our industry, things don’t get stable until they become old and boring. As an example, how many protocols had to die before we got REST? We built REST and AJAX and JSON over the corpses of SOAP and CORBA, using the lessons we learned while building them. That’s two major technology transitions, over the course of about 10 years. Yet, we still haven’t got the same level of tooling for REST-based APIs that we had for SOAP a decade ago, and SOAP in particular has yet to fully die.
The same thing is happening in the frontend, and indeed lots of folks compared my parody of the Docker ecosystem to the shit-show that’s going on in frontend development. And the same thing has been going on with programming languages since we escaped Java a decade ago. Consistently, until problems are good and solved, developers will continually come up with new solutions. And the Docker ecosystem has tons of problems to be solved.
So we can expect that Docker isn’t that mature yet. There will still be many edge cases and weirdnesses that you’re going to hit when you try it, and some of its decisions are weird and may actually be plain wrong when we look back on them from a few years hence. Best practices still have to be tried and failed and retried and refailed until we get them right.
It’s going to take a number of years until we figure all this stuff out and it settles down. But that doesn’t mean that containers are bullshit, or that we can ignore them. We are always faced with a choice between staying still with the technologies we know, or taking a bit of a leap and trying the new thing, learning the lessons and adapting and iterating and improving the industry around us.
If you’re looking for me, I’ll be in the future.