Hazards And Safeguards for Software Rewrites

Posted April 30, 2014 by Curtis Cooley


Throwing away a legacy system and rewriting it from scratch may be tempting, but it is also hazardous. Here are typical issues you'll encounter:

  • 100% feature matching is difficult
  • You risk building the new system as poorly as the old system
  • It's easy to underestimate the effort
  • Tension may build between the rewrite team and the support team

One major hazard is making sure the new system is feature complete. If you miss an oft-used feature, you'll injure your users.

Ensuring you have duplicated all the features will become the most time-consuming work during the rewrite.

The system you are replacing is still under maintenance and subject to change. As the current system changes, it gets harder to make sure the new system is feature complete.

One goal of a rewrite should be a cleaner code base than the existing system. Often the reason for a rewrite is that the existing system is such a big ball of mud that a rewrite is assumed to be cheaper than fixing it.

This assumption is often wrong. No matter how difficult refactoring may appear, rewriting will be worse.

Over and over, I've seen developers copy and paste code from the existing system into the new one in an effort to maintain feature completeness. I've even seen this when the new system is in a completely different language.

Copying and pasting code from the existing system will lead to just as big a mess as you currently have, defeating the purpose of the rewrite.

It seems completely logical that since you've written it once, rewriting it will cost less and take less time.

Sadly this is seldom the case.

It is safest to estimate that the rewrite will take about as much effort as the original.

The developers who wrote the system may no longer be around, and even if they are, they are the ones who created the big ball of mud in the first place.

The existing system has been under constant maintenance and has evolved to the point where comparing development times of the old version and the new is comparing apples to oranges.

Any time saved during development will likely be lost to making sure the new system is feature complete.

I've witnessed firsthand the frustration that the maintainers of a legacy application feel toward the team developing its replacement.

The usual strategy is a big bang replacement once the new app is ready. While waiting for that to occur, the existing app still requires maintenance.

Often the maintainers' morale slumps because they know their hard work is short-lived, or they feel that the rewrite will fail.

They want to make clean changes to the existing system but are not given the time and resources because it will soon be replaced.

Often it's more cost-effective to replace the most problematic pieces of the existing application.

So, rewrites are hazardous, but if you insist on the rewrite, here are some safeguards to consider.

Safeguards for Rewrites

You are probably aware of the 80:20 rule of software features. On average, 80% of the time your users are using only 20% of the application's features.

Rewrite and deploy the most frequently used features and the highest-risk stories first.

Instrument your system to log how many times each feature is used. You may even find some features are never used and do not need to be rewritten.
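As a rough illustration of that kind of instrumentation, here is a minimal Python sketch. Everything in it is hypothetical: the `track_usage` decorator, the in-memory `usage` tally, and the two sample features are stand-ins, and a real system would more likely write to its existing logs or metrics store.

```python
import functools
from collections import Counter

# Hypothetical in-memory tally of calls per feature (an assumption for this
# sketch; production code would persist this somewhere durable).
usage = Counter()

def track_usage(feature_name):
    """Count every call to the decorated function under feature_name."""
    def decorator(func):
        usage[feature_name] += 0  # register the feature so unused ones appear with 0
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            usage[feature_name] += 1
            return func(*args, **kwargs)
        return wrapper
    return decorator

@track_usage("export_report")
def export_report(order_id):
    return f"report for {order_id}"

@track_usage("legacy_fax")
def send_fax(order_id):
    return f"fax for {order_id}"

def unused_features():
    """Features that were registered but never called."""
    return sorted(name for name, count in usage.items() if count == 0)

def features_covering(share):
    """Most-used features that together account for `share` of all calls."""
    total = sum(usage.values())
    covered, chosen = 0, []
    for name, count in usage.most_common():
        if total == 0 or covered >= share * total:
            break
        chosen.append(name)
        covered += count
    return chosen

# Simulate some traffic: only one of the two features is ever used.
for order_id in (1, 2, 3):
    export_report(order_id)
```

After that simulated traffic, `unused_features()` reports `legacy_fax` as a candidate to leave out of the rewrite, and `features_covering(0.8)` lists the features to tackle first.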

Before you start the rewrite, while you are collecting usage statistics, figure out how to deploy the new system alongside the existing system. Figure out how to run them at the same time and direct users to the new system.

Your most effective safeguard against broken features is continuous deployment.

Before a line of code is written, figure out how to deploy early and often. Deploy to a live system, with real users.

Even better, deploy a feature at a time and cut users over to it. Martin Fowler calls this a strangler application. Users use more and more of the rewritten features until the legacy system is no longer required.
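One way to picture the feature-at-a-time cutover is a thin routing layer in front of both systems. The sketch below is only an assumption about how that might look: `StranglerRouter`, its method names, and the two invoice handlers are invented for illustration, and in practice the routing often lives in a proxy or load balancer rather than in application code.

```python
def legacy_create_invoice(order):
    # Stand-in for a call into the existing system.
    return {"system": "legacy", "total": order["qty"] * order["price"]}

def new_create_invoice(order):
    # Stand-in for the rewritten feature.
    return {"system": "new", "total": order["qty"] * order["price"]}

class StranglerRouter:
    """Send each feature to the legacy or the rewritten handler."""

    def __init__(self):
        self._legacy = {}
        self._new = {}
        self._cut_over = set()

    def register(self, feature, legacy_handler, new_handler=None):
        self._legacy[feature] = legacy_handler
        if new_handler is not None:
            self._new[feature] = new_handler

    def cut_over(self, feature):
        """Start sending users of this feature to the rewritten code."""
        if feature not in self._new:
            raise KeyError(f"no rewritten handler for {feature!r}")
        self._cut_over.add(feature)

    def roll_back(self, feature):
        """Return a feature to the legacy system if the rewrite misbehaves."""
        self._cut_over.discard(feature)

    def handle(self, feature, *args, **kwargs):
        if feature in self._cut_over:
            handler = self._new[feature]
        else:
            handler = self._legacy[feature]
        return handler(*args, **kwargs)

    def legacy_retired(self):
        """True once every registered feature has been cut over."""
        return set(self._legacy) <= self._cut_over

router = StranglerRouter()
router.register("create_invoice", legacy_create_invoice, new_create_invoice)
```

Calling `router.cut_over("create_invoice")` moves that one feature to the new code while everything else stays on the legacy system, and `roll_back` undoes it.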

Deploying as you build the new features will also alleviate developer tension because maintenance effort migrates to the cleanly written features.

Build a suite of automated characterization tests that verify your assumptions about what the current system does. Every feature you wish to rewrite should have a corresponding test.

You do not need a complete suite of tests before you start. Build characterization tests just in time as you decide which features to build.

Consider using tests instead of stories to document and track the features rewritten.

Before you start to rewrite a feature, write a test you can run against the existing system to verify your understanding of how the feature is expected to behave.

As you rewrite the feature, run that test against the new code. When the test is green, deploy. Tests will help you ensure your rewritten application is feature complete, not just code complete.
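Here is a small sketch of that workflow in Python. The shipping-cost functions and the recorded cases are made up for illustration; the point is only that the same recorded expectations are run first against the old behavior and then against the rewrite.

```python
import math

def legacy_shipping_cost(weight_kg):
    # Stand-in for the existing system, quirks included:
    # fractional kilograms are silently truncated.
    if weight_kg <= 0:
        return 0
    return 5 + 2 * int(weight_kg)

def new_shipping_cost(weight_kg):
    # Rewritten from scratch, but preserving the observed behavior.
    if weight_kg <= 0:
        return 0
    return 5 + 2 * math.floor(weight_kg)

def tidy_shipping_cost(weight_kg):
    # A "cleaner" rewrite that rounds instead of truncating.
    if weight_kg <= 0:
        return 0
    return 5 + 2 * round(weight_kg)

# Characterization cases: (input, output observed from the legacy system).
# They record what the system does, not what anyone thinks it should do.
CASES = [(0, 0), (1, 7), (2.9, 9), (-3, 0)]

def mismatches(implementation):
    """Return (input, expected, actual) for every case that differs."""
    failures = []
    for weight, expected in CASES:
        actual = implementation(weight)
        if actual != expected:
            failures.append((weight, expected, actual))
    return failures

legacy_failures = mismatches(legacy_shipping_cost)  # confirms our understanding
new_failures = mismatches(new_shipping_cost)        # green means safe to deploy
tidy_failures = mismatches(tidy_shipping_cost)      # catches the changed behavior
```

The first two lists come back empty, while the rounding version fails on the 2.9 kg case. That is the kind of silent behavior change a characterization test is meant to surface before users do.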

I worked on a rewrite from C++ to Java where I saw many instances of C++ code being copied and translated to Java. This is what I mean by code complete versus feature complete.

Had characterization tests been written for the C++ code, confidence in the freshly written Java code would have been high enough to avoid this poor practice.

 
The safest way to ensure a rewritten feature completely replaces a legacy one is automated testing.
 

No matter how thin your stories, they're still too big.

Your goal is stories that can be built, tested, and deployed in one to three days, with a preference for the shorter time span.

Thinly slicing stories is a skill you need to develop whether writing or rewriting an application. Thinly sliced stories are built, tested, and deployed sooner, greatly shortening the feedback cycle.

Summary

A complete rewrite of an existing application or system should be your last choice. It seems appealing, especially if the existing system is bug-ridden and the code is a mess, but take the time to rethink the rewrite and proceed with caution.

If you must rewrite, the preferred approach is a strangler application. Replace and deploy features that continuously deprecate the existing system. Often you can stop before the existing system is completely replaced.

When rewriting, identify the hazards and implement the appropriate safeguards to ensure the work goes smoothly and safely.

  • Tim Ottinger
  • Dave Rooney

    I’m coaching a team that has just started an effort to do this right now. The circumstances are a little different, though. The first system was written last fall in a panic in order to meet a commitment made by an executive (don’t ask). That one has just a few automated tests, and was intended to be “throwaway”. The people building the new system are the same ones who built the older one.

    The new system is supposed to be a copy of the 1st one with some changes for the different market to which it’s to be delivered. We started by building a story map with the older system as a guide, then individual stories that are very, very thin slices. How thin? The very first story is along the lines of “hard code this, hard code that, hard code this other thing and make sure that we can send that stuff to process X”. These *will* be delivered supported by tests.

    Oh, and as for the throwaway system, it’s now in ongoing maintenance because of the issues coming up due to the lack of testing. That has affected the capacity of the team to deliver work on the new system. :)

    • CurtisCooley

      How many “throwaway” systems are ever actually thrown away?