Use Data to Split a God Class
Don’t Make Me Touch Ordering!
The PM thought the feature was simple. And it should have been. Unfortunately, it requires making a tiny change to payment processing. And payment processing is part of the Order system…which is one monster class.
More than half of our code depends on Order to do something. Each part of that code depends on the Order for a different set of responsibilities. Although there are patterns, the Order class’s sheer size and number of responsibilities make these patterns hard to see.
Any change to Order could have unintended impacts on other code. Order is too big to understand in detail all at once. Some internal details appear to contradict each other, because they are used in different cases and have adapted to contextual business needs. So any summary-level understanding of it will contain errors, and those errors will turn changes into bugs.
Thus every feature that touches Order takes forever. We move slowly and carefully, we create and find and fix many bugs, we get it stable enough again, and we ship. And hope.
Order is a God Class. We need to split it up so that we can change it more quickly and safely.
A previous newsletter examined how to disentangle the God Class from the procedures that depend on it. Now we’re going to look inside the God Class. Our goal is to split it into several smaller classes, each of which manages a smaller set of responsibilities.
But the God Class is too complicated to understand fully.
- How can we break it up correctly if we can’t understand it?
- How can we identify the responsibilities for new classes?
- How can we afford to make these changes?
Organize your Warehouse
Assume we run a giant transfer warehouse that holds everything. One day we notice that orders tend to have the same kinds of goods. Sometimes people buy both electronics and groceries in the same order, but it is rare. We improve efficiency by sorting the warehouse, using each of the four corners to manage one category of goods. We know that some orders will cross corners, but most can be handled locally.
Customers and Suppliers (external code) still interact with the old warehouse (God Class). When each truck arrives we can direct it to the correct doors (Helper Methods on Implementation Classes) based on its content. Each corner (Implementation Class) then manages its orders. We just need to decide what items to put into what corners.
There are traditional segments for a warehouse - perishables are different from chemicals or electronics. There are good reasons for these categories, but some goods fall into multiple segments - where do you put household cleaners? We can use data to optimize the warehouse better by storing “things that are often ordered together” near each other.
The same is true for our God Class. Programmers traditionally group methods into classes based on responsibility. However, we don’t know what the responsibilities are right now, and some methods will fall into multiple responsibilities. Just like the warehouse, we want to use metrics to cluster our methods.
In programs, methods that share responsibilities also share data. Therefore even if we don’t know what the responsibilities are, we can group methods “that use the same data.”
- The useful clustering measure in a warehouse is “goods that are ordered together.”
- The useful clustering measure in a God Class is “methods that share data.”
As such, we will find the methods that use the same fields.
Once we have those related methods, we can break the God Class into implementation classes, where each class has one set of shared fields and the methods that use them.
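The clustering idea above can be sketched in a few lines. This is a minimal illustration, not the recipe itself: the method and field names in `FIELD_USAGE` are hypothetical, and it assumes you have already collected, by hand or with a static-analysis tool, which fields each method touches. It groups methods that are connected through any shared field.

```python
from collections import defaultdict

# Hypothetical usage data: the fields each Order method reads or writes.
# Real data would come from inspecting the God Class itself.
FIELD_USAGE = {
    "charge_card": {"payment_token", "amount_due"},
    "refund": {"payment_token", "amount_paid"},
    "apply_coupon": {"amount_due", "coupon_code"},
    "print_label": {"shipping_address", "weight"},
    "estimate_delivery": {"shipping_address", "carrier"},
}


def cluster_by_shared_fields(usage):
    """Group methods linked, directly or transitively, by a shared field."""
    parent = {method: method for method in usage}

    def find(method):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[method] != method:
            parent[method] = parent[parent[method]]
            method = parent[method]
        return method

    def union(a, b):
        parent[find(a)] = find(b)

    # Any two methods that touch the same field belong to the same cluster.
    first_user = {}
    for method, fields in usage.items():
        for field in fields:
            if field in first_user:
                union(method, first_user[field])
            else:
                first_user[field] = method

    clusters = defaultdict(lambda: {"methods": set(), "fields": set()})
    for method, fields in usage.items():
        root = find(method)
        clusters[root]["methods"].add(method)
        clusters[root]["fields"] |= fields

    # Each cluster is a candidate implementation class: (methods, fields).
    return sorted(
        (sorted(c["methods"]), sorted(c["fields"])) for c in clusters.values()
    )


for methods, fields in cluster_by_shared_fields(FIELD_USAGE):
    print(methods, "share", fields)
```

With this sample data the payment-related methods land in one cluster and the shipping-related methods in another. One caveat of this simple approach: a single field used by nearly every method would merge everything into one cluster, so such fields may need to be set aside before clustering.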
God Class Becomes a Shell Containing Useful Classes
Our God Class remains as a Facade between the external code and each of these implementation classes. It’s not perfect, but it is better than the God Class we have today, and it is the first step towards its eventual elimination.
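As a rough sketch of what that shell might look like, assume two hypothetical implementation classes, `PaymentDetails` and `ShippingDetails`, each holding one set of shared fields and the methods that use them. `Order` keeps its public methods, so external code does not change, but each method now only delegates.

```python
class PaymentDetails:
    """Hypothetical implementation class: payment fields and their methods."""

    def __init__(self, amount_due):
        self.amount_due = amount_due
        self.amount_paid = 0

    def charge(self, amount):
        if amount > self.amount_due:
            raise ValueError("cannot charge more than the amount due")
        self.amount_due -= amount
        self.amount_paid += amount


class ShippingDetails:
    """Hypothetical implementation class: shipping fields and their methods."""

    def __init__(self, address):
        self.address = address

    def label(self):
        return f"SHIP TO: {self.address}"


class Order:
    """The former God Class, reduced to a facade over implementation classes."""

    def __init__(self, amount_due, address):
        self._payment = PaymentDetails(amount_due)
        self._shipping = ShippingDetails(address)

    # External callers keep using Order; each method only delegates.
    def charge_card(self, amount):
        self._payment.charge(amount)

    def amount_due(self):
        return self._payment.amount_due

    def print_label(self):
        return self._shipping.label()


order = Order(amount_due=100, address="1 Main St")
order.charge_card(40)
print(order.amount_due(), order.print_label())
```

Because `PaymentDetails` has no dependency on `Order`, it can be unit tested on its own, which is where the validation benefit comes from.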
Breaking out all the implementation classes for a large God Class could take weeks or months. However, it is easy for each person to do one increment. Once you get it started, each developer who touches the God Class will make it a little better.
Benefits:
- Reduce cost for each story that changes the God Class by refactoring the God Class as part of that story.
- Reduce the cost for all future stories that change the God Class.
- Improve validation by unit testing the code that is currently in your God Class.
- Distribute the cost to clean up the God Class among all teams and stories that it impacts.
Downsides:
- One intermediate step makes readability inside the God Class slightly worse.
- God Class remains as a Facade, rather than having the external code simply use the smaller classes directly.
Demo the value to team and management…
Show three things at your sprint demo:
- Example: One set of isolated methods and how it decreased cost for a story this sprint.
- Progress: Number of LoC remaining in this God Class outside of implementation classes.
- Impact: Percentage of features that require changing this God Class.
Example: Reduced story cost from one isolation
Your goal is to show that the old system required a lot of work to make a safe change and how you safely changed the new system with lower cost.
Firstly, show how verifying the old system requires a ton of test cases.
- Show one example - a place where you needed to alter a God Class method.
- Show how many different parts of the product use that behavior directly or indirectly, or use some other God Class code that is a near duplicate.
- Then point out that testing this change requires you to manually inspect the intent of each caller to determine which ones want the new behavior and which want something else.
- Show your estimate for the number of errors (bugs) that would typically happen and the amount of time spent working carefully to prevent errors, looking for errors, and then fixing them.
Secondly, show the new code.
- Show how the new implementation class can be verified more easily with unit tests.
- Show how you can more easily choose different behaviors for different callers (by using different implementation classes or methods).
- Show places where you didn’t need to be as careful, and you were still able to avoid creating bugs.
- Show the actual amount of time you spent working on the story (including verification and bugfixes) + the amount of time you spent refactoring. Compare that to the estimated time you would have spent doing the story if you hadn’t refactored.
Impact: Decreased story cost
Step 1
First show the cost of a story that touches the God Class. Use historical data if possible.
- Person-hours to write.
- Person-hours to validate.
- Person-hours to fix bugs.
- Additional business impact of those bugs.
Step 2
Get historical data to show the percentage of features that require God Class changes. Multiply that by the cost above and the number of features per year to get an annualized cost to keep the God Class. This is the annual business cost to do nothing.
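The arithmetic for Steps 1 and 2 is simple enough to sketch. All numbers below are hypothetical placeholders, not measurements; substitute your own historical data. The sketch counts person-hours only and leaves the additional business impact of bugs out, since that is usually measured in different units.

```python
def annual_cost_of_god_class(cost_per_story, share_touching_god_class, features_per_year):
    """Annualized cost of doing nothing, in the same unit as cost_per_story."""
    return cost_per_story * share_touching_god_class * features_per_year


# Step 1: cost of one story that touches the God Class (hypothetical values).
hours_to_write = 40
hours_to_validate = 24
hours_to_fix_bugs = 16
cost_per_story = hours_to_write + hours_to_validate + hours_to_fix_bugs  # 80 person-hours

# Step 2: share of features that touch the God Class, and features per year
# (also hypothetical).
annual_cost = annual_cost_of_god_class(
    cost_per_story,
    share_touching_god_class=0.5,
    features_per_year=60,
)
print(f"Annual cost to keep the God Class: {annual_cost:.0f} person-hours")
```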
Step 3
Now show the cost of fixing the God Class:
- Cost to refactor out a helper method times the number of such methods to refactor out.
- Cost of additional verification for your extraction.
- Amount of delay or acceleration on the story that performs an extraction, calculated as per your example isolation.
- Business cost or value of that delay.
Step 4
Show the annualized cost of keeping the God Class vs the cost of fixing it. This could be a one-time display, or it could be a dashboard that updates as you factor code out of the God Class.
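One possible way to put the two numbers side by side is sketched below. The function names, the per-extraction breakdown, and every figure are assumptions made for illustration; the delay term is negative when the extraction speeds the story up. A dashboard could rerun the same calculation with the current count of remaining extractions.

```python
def cost_to_fix(extractions_remaining, refactor_hours, verification_hours, delay_hours):
    """Total person-hours to extract the remaining helper methods.

    delay_hours is the net effect on the story doing the extraction:
    positive for a delay, negative for an acceleration.
    """
    per_extraction = refactor_hours + verification_hours + delay_hours
    return extractions_remaining * per_extraction


def keep_vs_fix(annual_cost_to_keep, total_cost_to_fix):
    """Return both costs plus the fraction of a year's savings the fix consumes."""
    return {
        "annual_cost_to_keep": annual_cost_to_keep,
        "cost_to_fix": total_cost_to_fix,
        "fix_as_share_of_annual_cost": total_cost_to_fix / annual_cost_to_keep,
    }


# Hypothetical inputs: 120 methods left to extract, 3 hours to refactor each,
# 1 hour of extra verification, and each extraction saves the story 1 hour.
fix = cost_to_fix(
    extractions_remaining=120,
    refactor_hours=3,
    verification_hours=1,
    delay_hours=-1,
)
summary = keep_vs_fix(annual_cost_to_keep=2400, total_cost_to_fix=fix)
for name, value in summary.items():
    print(f"{name}: {value}")
```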