Hazards And Safeguards for Software Rewrites

Posted April 30, 2014 by Curtis Cooley


Throwing away a legacy system and rewriting it from scratch may be tempting, but it is also hazardous. Here are typical issues you'll encounter:

  • 100% feature matching is difficult
  • You risk building the new system as poorly as the old system
  • It's easy to underestimate the effort
  • Tension may build between the rewrite team and the support team

One major hazard is shipping a new system that is not feature complete. If you miss an oft-used feature, you'll hurt your users.

Ensuring you have duplicated all the features will become the most time-consuming work of the rewrite.

The system you are replacing is still under maintenance and subject to change. As the current system changes, it gets harder to make sure the new system is feature complete.

One goal of a rewrite should be a cleaner code base than the existing system. Often the reason for a rewrite is that the existing system is such a big ball of mud that a rewrite is assumed to be cheaper than fixing it.

This assumption is often wrong. No matter how difficult refactoring may appear, rewriting will be worse.

Over and over again, I've seen developers copy and paste code from the existing system into the new system in an effort to maintain feature completeness. I've even seen this when the new system is in a completely different language.

Copying and pasting code from the existing system will lead to just as big a mess as you currently have, defeating the purpose of the rewrite.

It seems completely logical that since you've written it once, rewriting it will cost less and take less time.

Sadly this is seldom the case.

It is safest to estimate that the rewrite will take about as much effort as the original.

The developers who wrote the system may not be around and they are the ones who created the big ball of mud in the first place.

The existing system has been under constant maintenance and has evolved to the point where comparing development times of the old version and the new is comparing apples to oranges.

Any time saved during development will likely be lost to making sure the new system is feature complete.

I've witnessed firsthand the frustration that the maintainers of a legacy application feel toward the team developing its replacement.

The usual strategy is a big bang replacement once the new app is ready. While waiting for that to occur, the existing app still requires maintenance.

The maintainers' morale often slumps because they know their hard work is short-lived, or because they expect the rewrite to fail.

They want to make clean changes to the existing system but are not given the time and resources because it will soon be replaced.

Often it's more cost-effective to replace only the most problematic pieces of the existing application.

So, rewrites are hazardous, but if you insist on the rewrite, here are some safeguards to consider.

Safeguards for Rewrites

You are probably aware of the 80:20 rule of software features. On average, 80% of the time your users are using only 20% of the application's features.

The most often used features and highest risk stories should be the features you rewrite and deploy first.

Instrument your system to log how many times each feature is used. You may even find some features are never used and do not need to be rewritten.
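One way to collect these usage counts is to wrap each feature's entry point in a counting decorator. The sketch below is a minimal illustration, not a prescription: the feature names, the track_usage decorator, and the in-memory feature_usage tally are all hypothetical, and a production system would more likely write to its existing logs or metrics store.

```python
import functools
from collections import Counter

# Hypothetical in-memory tally; a real system would persist this somewhere durable.
feature_usage = Counter()


def track_usage(feature_name):
    """Decorator that counts every call to a feature's entry point."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            feature_usage[feature_name] += 1
            return func(*args, **kwargs)
        return wrapper
    return decorator


# Hypothetical features of the legacy application.
@track_usage("export_report")
def export_report(fmt):
    return f"report.{fmt}"


@track_usage("print_invoice")
def print_invoice(invoice_id):
    return f"invoice {invoice_id}"


@track_usage("fax_summary")
def fax_summary():
    return "fax sent"


def unused_features(known_features):
    """Return the features that were never called while usage was recorded."""
    return sorted(name for name in known_features if feature_usage[name] == 0)
```

After a representative period, feature_usage.most_common() ranks features by call count, and unused_features(...) lists candidates you may not need to rewrite at all.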

Before you start the rewrite, while you are collecting usage statistics, figure out how to deploy the new system alongside the existing system. Figure out how to run them at the same time and direct users to the new system.

Your most effective safeguard against broken features is continuous deployment.

Before a line of code is written, figure out how to deploy early and often. Deploy to a live system, with real users.

Even better, deploy a feature at a time and cut users over to it. Martin Fowler calls this a strangler application. Users use more and more of the rewritten features until the legacy system is no longer required.
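The feature-at-a-time cutover can be pictured as a thin routing layer that sits in front of both systems. The following is only a sketch under stated assumptions: StranglerRouter, its method names, and the two handler functions are invented for illustration, and in practice the routing often lives in a proxy, load balancer, or feature-flag service rather than in application code.

```python
# Hypothetical handlers; in practice these would call the two running systems.
def legacy_handler(feature, payload):
    return f"legacy:{feature}:{payload}"


def new_handler(feature, payload):
    return f"new:{feature}:{payload}"


class StranglerRouter:
    """Routes a request to the new system only if its feature has been cut over."""

    def __init__(self, legacy, new):
        self._legacy = legacy
        self._new = new
        self._migrated = set()

    def cut_over(self, feature):
        """Start serving this feature from the rewritten code."""
        self._migrated.add(feature)

    def roll_back(self, feature):
        """Send this feature back to the legacy system if the rewrite misbehaves."""
        self._migrated.discard(feature)

    def handle(self, feature, payload):
        target = self._new if feature in self._migrated else self._legacy
        return target(feature, payload)

    def legacy_still_needed(self, all_features):
        """True while at least one feature is still served by the legacy system."""
        return any(feature not in self._migrated for feature in all_features)
```

Keeping a roll_back path makes each cutover a small, reversible step, which is much of what separates this approach from a big bang replacement.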

Deploying as you build the new features will also alleviate developer tension because maintenance effort migrates to the cleanly written features.

Build a suite of automated characterization tests that verify your assumptions about what the current system does. Every feature you wish to rewrite should have a corresponding test.

You do not need a complete suite of tests before you start. Build characterization tests just in time as you decide which features to build.

Consider using tests instead of stories to document and track the features rewritten.

Before you start to rewrite a feature, write a test you can run against the existing system to verify your understanding of how the feature is expected to behave.

As you rewrite the feature, run that test against the new code. When the test is green, deploy. Tests will help you ensure your rewritten application is feature complete, not just code complete.
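A characterization test of this kind might look like the sketch below. Everything in it is hypothetical: the shipping-cost feature, both functions, and the recorded values stand in for whatever your legacy system actually does. The point is that the expected values come from observing the old system, quirks included, and the same cases then run against the rewrite.

```python
import unittest


def legacy_shipping_cost(weight_kg):
    """Stand-in for the existing system's behavior, quirks included."""
    if weight_kg <= 0:
        return 0.0  # observed quirk: non-positive weights ship for free
    return round(5.0 + 1.25 * weight_kg, 2)


def new_shipping_cost(weight_kg):
    """Stand-in for the rewritten feature."""
    base, per_kg = 5.0, 1.25
    return round(base + per_kg * weight_kg, 2) if weight_kg > 0 else 0.0


# Expected values recorded by running the legacy system, not taken from a spec.
CHARACTERIZED_CASES = [(-1, 0.0), (0, 0.0), (1, 6.25), (10, 17.5)]


class LegacyShippingCostCharacterization(unittest.TestCase):
    """Pins down what the legacy implementation currently does."""

    implementation = staticmethod(legacy_shipping_cost)

    def test_matches_recorded_behavior(self):
        for weight, expected in CHARACTERIZED_CASES:
            with self.subTest(weight=weight):
                self.assertEqual(self.implementation(weight), expected)


class NewShippingCostCharacterization(LegacyShippingCostCharacterization):
    """Runs the same recorded cases against the rewritten feature."""

    implementation = staticmethod(new_shipping_cost)
```

Running the file with python -m unittest exercises both classes; the rewritten feature is a candidate for deployment only when the second one is green.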

I worked on a rewrite from C++ to Java where I saw many instances of C++ code being copied and translated to Java. This is what I mean by code complete versus feature complete.

Had characterization tests been written for the C++ code, confidence in the freshly written Java code would have been high enough to avoid this poor practice.

 
The safest way to ensure a rewritten feature completely replaces a legacy one is automated testing.
 

No matter how thin your stories, they're still too big.

Your goal is stories that can be built, tested, and deployed in one to three days, with a preference for the shorter time span.

Thinly slicing stories is a skill you need to develop whether writing or rewriting an application. Thinly sliced stories are built, tested, and deployed sooner, greatly shortening the feedback cycle.

Summary

A complete rewrite of an existing application or system should be your last choice. It seems appealing, especially if the existing system is bug-ridden and the code is a mess, but take the time to rethink the rewrite and proceed with caution.

If you must rewrite, the preferred approach is a strangler application. Replace and deploy features that continuously deprecate the existing system. Often you can stop before the existing system is completely replaced.

When rewriting, identify the hazards and implement the appropriate safeguards to ensure the work goes smoothly and safely.

6 Responses to “Hazards And Safeguards for Software Rewrites”

  1. Tim Ottinger says:

    I consider rewriting a system because of tech debt to be the metaphorical equivalent of declaring bankruptcy. It’s amazing how much a team of committed refactoring programmers can change a code base in just a year or two (http://ryber.github.io/blog/2011/04/19/the-big-book-of-dead-code/).

    I’ve more sympathy for moving between languages or platforms than rewriting for reasons of poor code.

    • Jason Kerney says:

      Even under those conditions I would suggest that you whittle the old system down, replacing a single part at a time and starting with dependent systems, rather than doing a full rewrite.

  2. Dave Rooney says:

    I’m coaching a team that has just started an effort to do this right now. The circumstances are a little different, though. The first system was written last fall in a panic in order to meet a commitment made by an executive (don’t ask). That one has just a few automated tests, and was intended to be “throwaway”. The people building the new system are the same people who built the older one.

    The new system is supposed to be a copy of the 1st one with some changes for the different market to which it’s to be delivered. We started by building a story map with the older system as a guide, then individual stories that are very, very thin slices. How thin? The very first story is along the lines of “hard code this, hard code that, hard code this other thing and make sure that we can send that stuff to process X”. These *will* be delivered supported by tests.

    Oh, and as for the throwaway system, it’s now in ongoing maintenance because of the issues coming up due to the lack of testing. That has affected the capacity of the team to deliver work on the new system. :)

  3. Jo Van Eyck says:

    I completely agree. The only time I found a rewrite worth the investment was when moving from an application written in pure SQL + purely transformative web layer to a more traditional architecture. Even then, incremental work using a strangler application is essential. Code quality/design issues can be tackled much more cost-effectively than by rewriting it all.

  4. Jason Kerney says:

    I got the position I held just before my current one because of a desire to rewrite. I told them that my taking the position was contingent on not doing the rewrite. … Let me explain.

    The company was a small organization that did not consider itself a software company; however, it was built on a wad of custom software projects that were all interdependent. This conglomeration had reached a point where no change had successfully happened in 3 years, and the company fired all the programmers.

    So when I insisted that there not be a rewrite, you can imagine the look they gave me. I offered to take a 3 month contract so that we could see what happened. The contract was signed with both a good friend of mine and me.

    During those 3 months everybody agreed on no new features, only bug fixes. This was easy since the bug list was large. We tackled the most visible bugs for the company first, and the most painful bugs for the developers second.

    Within a week we had our first release; it was small but tackled a very visible bug. In 4 more days we had our second release. Then 2 days and then 5. With each release the internal customers became happier. Within the first month no one was talking about a rewrite.

    At the end of our first year, the company was releasing at least once a week into production. They were able to tackle new features and still tackle those old bugs. The best part is that we did not release any new bugs into the system.

    Within a year we had really started decoupling all the specialized applications so that each could function without knowledge of the internals of the others.

    My friend left after his 3 months; I stayed on and hired a new team. The amazing thing was that a small team of 2 – 3 programmers drastically changed an environment with over 2 million lines of code from undeliverable to deliverable every week. We did this in less than 6 months.

    So I strongly advise against rewrites. They waste your money and are based on lies and cognitive biases.
