Sometimes you’re backed into a dark corner, and a creeping thought comes into your head: “Rewrite this dark corner that everyone’s afraid to touch.”
Rewriting/refactoring comes in several flavours:
- Garden-variety full system re-architecting
- Swapping out one micro-service for another
- Redoing a namespace in your codebase
- Just tidying up a single function
The one thing that makes all of these things equally terrifying: do you really know what’s going to break when you refactor?
Sometimes you do know exactly what’s going to break: someone did the work for you, and you have a nice suite of test-driven…tests. Other times, you can find all usages of your bit of code and just check them off. But often you’ll find that the section of code hasn’t been touched since your company’s founder wrote it, and that’s because there are no tests, and thus no reason or hope.
The best way to build your confidence with this sort of thing is to find (or build) a test suite. For full-system and micro-service rewrites, folks usually use something like canary builds or Scientist, GitHub’s library for running old and new code paths side by side. You get the perks of seeing real traffic and scale, what the answer should have been, and what your changes will affect. After enough time in the green, you cut over to the new code.
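To make the side-by-side idea concrete, here is a minimal sketch of that pattern in Python. It is not Scientist’s actual API; `run_experiment`, `legacy_total`, and `rewritten_total` are hypothetical names made up for illustration. The caller always gets the old answer back, and any disagreement is only logged.

```python
import logging

logger = logging.getLogger("experiment")


def run_experiment(name, control, candidate, *args, **kwargs):
    """Call both implementations, log any disagreement, return the control result."""
    expected = control(*args, **kwargs)
    try:
        actual = candidate(*args, **kwargs)
    except Exception as exc:
        # The candidate is allowed to fail; callers still get the old answer.
        logger.warning("%s: candidate raised %r for args=%r", name, exc, args)
        return expected
    if actual != expected:
        logger.warning("%s: mismatch for args=%r: %r != %r", name, args, expected, actual)
    return expected


# Hypothetical old and new implementations of the same operation.
def legacy_total(prices):
    total = 0
    for price in prices:
        total += price
    return total


def rewritten_total(prices):
    return sum(prices)


if __name__ == "__main__":
    print(run_experiment("total", legacy_total, rewritten_total, [1, 2, 3]))
```

The important design choice is that the new code can never change what the caller sees, so it can sit in front of real traffic until the mismatch log stays quiet.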
But what happens when storing production traffic locally is dangerous, or the iteration loop for your fixes is more trouble than it’s worth?
Years ago, I had to fix a bug in some graph building code. PHP was the name of the game, and the couple of files concerned were…convoluted. A quick attempt at the bug resulted in completely unrelated (to my mortal mind) code going haywire. Prod down. Knowing what the function calls were returning was borderline impossible, and I was pretty nervous to rip it all out.
But that’s exactly what I did.
I set about making “logical refactors” to clean things up. The whole time I kept thinking: “Damn, I am the best, no tests broken!” My boss came by my desk a couple days later to see how it was all going and saw a line diff…in the thousands. He stifled a cough and kindly asked, “How do the tests look for this?”
“All green in CircleCI!” I happily replied. If you haven’t already guessed why everything was green, it’s because there were no tests.
Without any tests to break, I had been left afloat in a sea of ignorance. When I realized the problem, I threw the entire branch out and began putting tests in place, building the suite I wished I had had. I submitted a thousand-line PR for that, then began working on the failing test and yet another thousand-line refactor.
The benefit of using production traffic in your suite is that it shows you inputs you could never have planned for. The benefit of unit/integration testing is that you can iterate quickly.
Enter generative testing.
Last December, I gave a talk at Clojure Conj titled “Charting the English Language…In Pure Clojure”. For those who don’t have time to watch the full thing, here’s the high level: pretty pictures using big data are fun.
The goal of this talk was to follow a given process for producing a data-generated image and reproduce that image using only Clojure. This was a fun foray into various corners of the JVM, Linux, and machine learning.
However, the exercise forced me to effectively test my implementation against the original.* For the final image, that doesn’t really work since the seed data is random. Additionally, the original code base didn’t have any tests, and translating Python/C++ to Clojure is kind of an…eye-crossing experience. My ultimate solution was to use generative testing, modifying the original code base along with my own so that I could compare small operations on inputs generated at will.
My steps for getting the testing interop set up are paleolithic at best. The full code base is available here. The gist of what I did was make Python interop work by encoding Clojure structures as Python commands: a Clojure matrix becomes a string that NumPy can understand. Spit that to the command line, wait for the result, then translate the NumPy output into JSON and compare everything in Clojure. I decided to only use JSON parsing in one direction to avoid having to muck with the target implementation I wanted to copy.
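As a rough illustration of that round trip (not the code from the talk, which drove Python from Clojure), here is a sketch with Python on both sides. The function names are hypothetical, and actually running `evaluate_in_reference` assumes NumPy is installed for the interpreter being called.

```python
import json
import subprocess
import sys


def to_numpy_literal(matrix):
    """Encode a nested list of finite numbers as a NumPy expression string."""
    rows = ", ".join(
        "[" + ", ".join(repr(float(x)) for x in row) + "]" for row in matrix
    )
    return "np.array([" + rows + "])"


def build_command(expression):
    """Build a one-line program that evaluates `expression` and prints JSON."""
    return (
        "import json, numpy as np; "
        "print(json.dumps(np.asarray(" + expression + ").tolist()))"
    )


def evaluate_in_reference(expression, python=sys.executable):
    """Run the expression in a separate interpreter and parse its JSON output.

    Assumption: NumPy is importable by the interpreter named in `python`.
    """
    completed = subprocess.run(
        [python, "-c", build_command(expression)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


if __name__ == "__main__":
    literal = to_numpy_literal([[1, 2], [3, 4]])
    print(literal)
    # With NumPy available, this asks the reference side for the transpose:
    # print(evaluate_in_reference(literal + ".T"))
```

JSON only flows one way here as well: the input goes over as a plain expression string, and only the result comes back as JSON.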
*Initially, I started without any tests, hoping to translate the Python/C++ code straight to Clojure. I had a lot of the code in place, but my program’s runs kept ending with all of the words at a single point, throwing NPEs, or grinding on overnight without any sign of terminating.
Testing is one of my favourite parts of building software. I have always leaned more toward the abstract/theoretical side of Computer Science, and tests are, in my mind, proofs of correctness. When done elegantly, you can effectively sign your code: QED.
Generative testing is just another tool in this realm. Typically, you can somewhat predict the shape of what’s going into a function and can always make a claim about what’s coming out. Smaller scale tests are still useful for concise statements about the “what”. But instead of having to go through the standard checklist of “negative, zero, empty, nil, one, many, full, first, middle, last…” in test cases, you can just wave your hands and say, “Randomly speaking, I know what I’m talking about.”
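Here is a minimal sketch of that hand-wave, using only Python’s standard library. The two `sort_unique` functions are hypothetical stand-ins for “the original” and “the rewrite”; the generator produces empty, single-item, and many-item lists with negatives, zeros, and duplicates, so nobody has to write the checklist by hand.

```python
import random


def reference_sort_unique(values):
    """Stand-in for the original implementation we trust."""
    result = []
    for value in sorted(values):
        if not result or result[-1] != value:
            result.append(value)
    return result


def rewritten_sort_unique(values):
    """Stand-in for the rewrite under test."""
    return sorted(set(values))


def random_int_list(rng, max_length=20, low=-100, high=100):
    """Generate a list of 0 to max_length integers, duplicates allowed."""
    length = rng.randint(0, max_length)
    return [rng.randint(low, high) for _ in range(length)]


def check_equivalence(reference, candidate, generator, trials=1000, seed=0):
    """Return the first generated input where the functions disagree, else None."""
    rng = random.Random(seed)  # fixed seed so a failure can be replayed
    for _ in range(trials):
        sample = generator(rng)
        # Pass copies so neither function can mutate the other's input.
        if reference(list(sample)) != candidate(list(sample)):
            return sample
    return None


if __name__ == "__main__":
    failure = check_equivalence(
        reference_sort_unique, rewritten_sort_unique, random_int_list
    )
    print("no disagreement found" if failure is None else failure)
```

Dedicated libraries such as test.check for Clojure or Hypothesis for Python go further, most notably by shrinking a failing input down to a small counterexample, but the core loop is the same: generate, run both, compare.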