How To Do A (Successful) Rewrite

We've all been there. You're years into a project, you're slogging through the tough 20% at the end, and somebody says it. The phrase no project owner wants to hear. "We should really just rewrite this in another language." The manager groans, the devs start arguing about which language, the product owner pushes the roadmap out 6 months, and everyone gets ready for the fights.

This is how it goes on every team at a certain point, but it doesn't have to be this bad an experience. I swear it doesn't; I've seen it. I was on a team that rewrote two major components with no downtime and no breaking changes. While doing this, we brought response time down from >10 seconds to <1 second, raised reliability from a ~60% success rate to more than five 9s, and saved >$10,000,000 per year on our AWS spend to boot. And today I'm going to tell you how we did it.

Background

Before we talk about the how, let's talk about the what. The project I'm talking about was a large Scala project with two parts: a Play API that generated images and/or raster summaries of geospatial data in real time, and a data processor that used Monix tasks and processed approximately 50TB per day during peak load. These two components contained every possible sin in the Scala ecosystem. Implicits made it impossible to understand where anything came from, it took days to understand what certain type signatures actually meant, and our logging library, of all things, required at least one graduate-level category theory class to understand.

If you don't know what Play or Monix are, or if the majority of the words from that paragraph are unfamiliar to you, that's fine. You don't need to know them for this process, but you do need to know how much I envy you. The point is that this was not a 5k-line Node API, and it was not a small task to pull the API and data processor apart.

The Target

The target language for our rewrite was Go. This was not the easiest language to port to, as Scala and Go don't map particularly well onto each other, and a lot of things need to change to get idiomatic Go from Scala. I'd estimate less than 10% of the code actually looked similar at the end of this process.

This was a very large undertaking, and we couldn't afford to do it all at once. We had consumers and maintenance and everything else you can think of that a production system might have. We had plenty of budget, but we didn't have time to take two years to do it right from the beginning. It was this pressure that led the team to develop a simple three-step plan to rewrite the project.

Step 1: Identify A Piece to Move

Step 1 is to identify the smallest possible chunk that you can rip out of the system. For pragmatic reasons, this is usually something "at the back" of your system. It's a piece that interacts with the backing data store, and if your system has enough problems that you're considering a rewrite, it has probably been patched repeatedly without being understood. It has, to put it mildly, a lot of problems.

Once we have identified this piece, the plan is to rip it out, put it in the new language on a server somewhere, and call it from the old language. This is where we come to the first trade-off of the whole process. We're going to introduce a network call that wasn't there before. Now, if you have the kind of problems that justify a rewrite, it should be easy to find enough stability or performance gains to offset a single network call, at least by the end of the process. If you can't reasonably see how you're going to do this, you should take a long, hard look at whether a rewrite is desirable.
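To make the shape of that trade-off concrete, here is a minimal sketch in Go of an extracted piece sitting behind an HTTP endpoint. Everything in it is an assumption for illustration: the Record type, the /records path, the in-memory store standing in for the real data store, and the fetch helper that plays the role of the legacy caller. It is not the code from our project, just one way the boundary could look.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// Record is a hypothetical shape for whatever the extracted piece reads
// from the backing data store.
type Record struct {
	ID    string `json:"id"`
	Value int    `json:"value"`
}

// store stands in for the real data store (an assumption); a map keeps
// the sketch runnable without any external dependency.
var store = map[string]Record{
	"a1": {ID: "a1", Value: 42},
}

// newHandler exposes the extracted piece over HTTP so the legacy system
// can call it in place of its old in-process code path.
func newHandler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/records", func(w http.ResponseWriter, r *http.Request) {
		rec, ok := store[r.URL.Query().Get("id")]
		if !ok {
			http.Error(w, "record not found", http.StatusNotFound)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		if err := json.NewEncoder(w).Encode(rec); err != nil {
			http.Error(w, "encoding failed", http.StatusInternalServerError)
		}
	})
	return mux
}

// fetch plays the role of the legacy caller. It builds the request the old
// system would now send and runs it against the handler in process, so the
// sketch needs no open port. It returns the status code and decoded record.
func fetch(h http.Handler, id string) (int, Record) {
	req := httptest.NewRequest(http.MethodGet, "/records?id="+url.QueryEscape(id), nil)
	resp := httptest.NewRecorder()
	h.ServeHTTP(resp, req)

	var rec Record
	if resp.Code == http.StatusOK {
		if err := json.NewDecoder(resp.Body).Decode(&rec); err != nil {
			return http.StatusInternalServerError, Record{}
		}
	}
	return resp.Code, rec
}

func main() {
	code, rec := fetch(newHandler(), "a1")
	fmt.Println(code, rec.Value)
}
```

In a real migration the handler would run on its own server and the old system would reach it with its own HTTP client. The narrow request and response types are the part worth copying, since they are what lets the old and new code evolve separately.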

Step 2: Porting

Once you've identified a piece to rewrite, step 2 is actually the easiest. You blindly port that piece to the new language, preserving the behavior (and possibly the sins of the previous project). This will feel wrong at the time, but it's an important part of the process. We need a duplicate part of the system to swap in as quickly as possible. Taking enough time to do things "the right way" is going to scare management after one or two steps, and we don't yet know what "the right way" is.

Blindly porting is an important part of the process, and at the end of it, you will know exactly what the component is doing. You may be the first person to understand it in a decade. If it's been patched repeatedly, you may be the first person to ever understand it in its current incarnation. Only now do you have the knowledge to understand what "the right way" is.
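One way to keep a blind port honest is to record input and output pairs from the old system and replay them against the new code. The Go sketch below assumes a hypothetical ported function, normalizeKey, and a couple of made-up recorded cases. The quirk it preserves (blank input becomes "unknown") is invented for illustration and stands in for whatever odd behavior the old component really had.

```go
package main

import (
	"fmt"
	"strings"
)

// goldenCase pairs an input with the output recorded from the legacy system.
type goldenCase struct {
	Input string
	Want  string
}

// normalizeKey is a hypothetical ported function. It keeps the legacy
// behavior exactly, including a quirk we might prefer to remove later.
func normalizeKey(s string) string {
	s = strings.ToLower(strings.TrimSpace(s))
	if s == "" {
		return "unknown" // legacy quirk, kept on purpose during the blind port
	}
	return s
}

// checkParity runs the port against every recorded case and returns a
// description of each case where the port disagrees with the legacy output.
func checkParity(cases []goldenCase, port func(string) string) []string {
	var mismatches []string
	for _, c := range cases {
		if got := port(c.Input); got != c.Want {
			mismatches = append(mismatches,
				fmt.Sprintf("input %q: legacy %q, port %q", c.Input, c.Want, got))
		}
	}
	return mismatches
}

func main() {
	// Hypothetical recordings; real ones would be captured from production
	// traffic or from the legacy test suite.
	recorded := []goldenCase{
		{Input: "  Tile-7 ", Want: "tile-7"},
		{Input: "", Want: "unknown"},
	}
	mismatches := checkParity(recorded, normalizeKey)
	for _, m := range mismatches {
		fmt.Println("MISMATCH:", m)
	}
	fmt.Println("mismatches:", len(mismatches))
}
```

A check like this deliberately says nothing about whether the behavior is good. It only tells you whether the port is still a duplicate of the old piece, which is all step 2 asks for.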

Step 3: Massaging the Port

This step is technically optional. If "the right way" is quick enough, you can go do it with the original port. If it takes a long time and you don't yet have management's confidence (don't worry, they'll come around when the system stops failing and starts costing less money), you need to write down what needs to be done and take care of it the next time you're in the new code. Every time you're in the new code, you need to make some progress, however small, towards "the right way." It can be as simple as merging three if statements that were hacked in during maintenance tickets over the past 5 years, but you have to make some progress.
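As a small illustration of that kind of progress, the Go sketch below merges three separately patched if statements into one. The function names and the status codes are hypothetical, chosen only to make the example runnable. The loop in main checks that the cleaned-up version behaves the same as the blindly ported one before the old version is deleted.

```go
package main

import "fmt"

// shouldRetryPorted is the blind port: three checks that were (in this
// hypothetical) added by three separate maintenance tickets.
func shouldRetryPorted(status int) bool {
	if status == 502 {
		return true
	}
	if status == 503 {
		return true
	}
	if status == 504 {
		return true
	}
	return false
}

// shouldRetry is the massaged version: the same rule, stated once.
func shouldRetry(status int) bool {
	switch status {
	case 502, 503, 504:
		return true
	}
	return false
}

func main() {
	// Compare both versions over every status code we care about, so the
	// cleanup is shown to preserve behavior.
	for status := 100; status < 600; status++ {
		if shouldRetryPorted(status) != shouldRetry(status) {
			fmt.Println("behavior changed for status", status)
			return
		}
	}
	fmt.Println("cleanup preserves behavior")
}
```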

Conclusion

That's really it. Repeat those three steps enough times and everything will exist in the new system. You may have noticed that rewriting in another language isn't actually required for that last step, and this is true. If you continuously massage your code and make it better, and take the time to reexamine entire systems so that you understand them from scratch, you probably won't ever need a rewrite. Systems, however, end up in the state where they need a rewrite because we often don't have the time to do those things regularly.

A rewrite is often proposed for performance reasons (as in our case, switching to Go) or stability reasons (often the case when switching from dynamically to statically typed languages), and these aren't unimportant. But if I had to choose one thing that makes some rewrites succeed and others fail, it would be that the successful teams take the time to understand the system they're rewriting. As a nice bonus, since code is quicker to write than to read if you're trying to get a similar level of understanding, a rewrite can often be quicker than trying to understand a legacy system from the top down.

A rewrite doesn’t need to be an act of desperation. Make it a strategic investment instead. If your team is staring down a legacy system and debating whether to burn it down or live with it forever, the answer is probably neither. Move deliberately, reduce risk, and earn trust step by step. At Source Allies, we’ve helped clients modernize critical systems without downtime, without drama, and without gambling the business. If you're considering a rewrite, we’d be happy to talk through what that could look like for your business.