One of the paradigm shifts that makes a big difference to working well in software development is learning to slice problems into small bites. The joke “How do you eat an elephant? One bite at a time.” references the age-old problem-solving strategy of breaking a big problem into smaller pieces.
The “not so common” adaptation of that is not to stop there, but to ask, “But what is the best way to slice it up?” Our goal as software developers in slicing problems (adding functionality to an application) is to slice them so that each slice adds value, is testable, does no harm, can be deployed all the way to prod, and brings us closer to the goal.
Enter the Elephant
I was on a team that had a large and difficult problem imposed on it because management wanted to hand over half of the team’s responsibilities to another team. The microservice codebases were not split along these responsibilities, which complicated things. At least 4 codebases were involved, with around 90 endpoints hitting 3 types of databases.
Having both teams working from the same codebase made deployments very tricky and time-consuming. We needed to split the codebases in a way that didn’t bring everything to a grinding halt for several months.
Our goals in slicing this work into small slices were:
- Each slice would go to prod
- Each slice would not break the app running in prod
- We would not need to duplicate new features in the new codebase
- We would not stop implementing new features while working on this
It is okay to not perfectly achieve all of these, but we wanted to get as close to them as possible.
Problems with Scope
One of the impulses we had to resist was making other improvements while doing this. One such change was moving to a more domain-driven model of microservices to prevent this kind of problem from happening again. It was a great idea, and exciting to think about, but we decided in the end it would be better to move in smaller slices and complete the split before starting that. We needed to limit our scope to shorten the time it would take before we were free from the old codebase and the restrictions of working on it with the other team.
Slicing the Elephant
Before we started, we knew that we wanted to split the codebase by duplicating the repo into a new GitHub repo and deploying the duplicate to prod in a way that did no harm. Industrial Logic’s eLearning Refactoring album calls this the “Parallel Change” refactoring strategy: “Parallel an old way of doing something with a new way, then safely cut over from the old to the new, then remove the old way”.
We wanted to keep the automated end-to-end tests the same. However, we didn’t want the duplicate to conflict with the existing app by hitting the real database. So we did a spike in a branch, which confirmed we could deploy the duplicate with its database calls temporarily pointed at an internal embedded database. That way, the new endpoints in prod would do no harm.
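A minimal sketch of that kind of switch might look like the following. It is illustrative only and not the code from the spike: the `use_embedded_db` setting, the `connect` function, the `orders` table, and the choice of SQLite as the embedded database are all assumptions made for the example.

```python
import sqlite3


def connect(settings):
    """Return a database connection for the duplicated service.

    Hypothetical behavior: the service defaults to a harmless in-memory
    database and only talks to the real one when explicitly told to.
    """
    if settings.get("use_embedded_db", True):
        # Embedded, in-process database: nothing written here can reach prod data.
        conn = sqlite3.connect(":memory:")
        # Hypothetical schema so the endpoints and end-to-end tests have something to hit.
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
        return conn
    # Placeholder: the real connection is wired in during a later slice.
    raise NotImplementedError("real database connection goes here")
```

Defaulting to the embedded database means a missing or forgotten setting fails safe: the new app can sit in prod without touching real data until someone deliberately flips the switch.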
There were multiple microservices, each deployed as its own app with its own GitHub repo. So we checked our backlog of stories to see what features were coming up next, and grouped them by the codebase where the changes needed to happen. We then ordered the feature work so that no upcoming story touched the first codebase we wanted to split.
The next step was duplicating the codebase in GitHub in a way that kept all the commit history from the old codebase. Then we changed the duplicate to use the embedded database and created a simplified Jenkins CI/CD pipeline that ran the tests and deployed our app through the environments and into prod.
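One common way to duplicate a repo with its full history is a bare clone followed by a mirror push into a new, empty repo. The sketch below builds those two git commands. It is an assumption about how such a duplication could be done, not a record of the commands we ran, and the function names, URLs, and working directory are hypothetical.

```python
import subprocess


def duplicate_repo_commands(old_url, new_url, workdir="old-repo.git"):
    """Build the git commands that copy every branch, tag, and commit."""
    return [
        # A bare clone fetches all branches and tags without a working tree.
        ["git", "clone", "--bare", old_url, workdir],
        # A mirror push sends every ref to the new (empty) repository.
        ["git", "-C", workdir, "push", "--mirror", new_url],
    ]


def duplicate_repo(old_url, new_url):
    """Run the commands in order, stopping at the first failure."""
    for cmd in duplicate_repo_commands(old_url, new_url):
        subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes the plan easy to review before anything is pushed.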
After that, we deleted all the endpoints that we were not keeping in the new codebase, along with their end-to-end tests, and deployed that to prod.
Next, we connected the new codebase to the real database and switched the client over to calling the new endpoints. This turned out to be a fairly small step, and the beauty of it was that we didn’t have to switch every endpoint at the same time. We switched over the endpoints that were needed for the next feature story first.
Then we transitioned the rest of the endpoints and repeated the whole process for the other services.
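To illustrate how a client can cut over one endpoint at a time, here is a small sketch. It is not our client code: the base URLs, the endpoint paths, and the `MIGRATED_ENDPOINTS` set are hypothetical names chosen for the example.

```python
# Hypothetical base URLs for the old and new deployments.
OLD_BASE_URL = "https://old-service.example.com"
NEW_BASE_URL = "https://new-service.example.com"

# Endpoints already cut over to the new codebase (hypothetical paths).
# Growing this set one entry at a time is the whole migration.
MIGRATED_ENDPOINTS = {"/orders", "/orders/status"}


def build_url(path, migrated=MIGRATED_ENDPOINTS):
    """Route a call to the new service only if its endpoint has been migrated."""
    base = NEW_BASE_URL if path in migrated else OLD_BASE_URL
    return base + path
```

With a list like this, cutting over an endpoint is a one-line change, and rolling it back is just as small.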
This allowed us to mostly achieve our goals:
- Each slice went to prod, allowing us to deal with the problems involved with deploying an application in the infrastructure
- Each slice did not break production
- No duplication of features was required
- We rearranged the priority of features so that, from the outside, feature work never appeared to stop. The code freeze on each application was short enough not to be painful.
So we learned that big problems can be sliced thoughtfully, so that each slice adds value, is testable, does no harm, is deployed all the way to prod, and brings us closer to the goal. There are multiple ways to apply these goals to any large problem, depending on the context.