This week we look at a simple refactoring rule called the “Rule of Three”. We’ll also see how it relates to some other well-known programming rules.
Things structured in threes are a very common theme. There are children’s stories like the Three Little Pigs, Three Billy Goats Gruff, and Goldilocks and the Three Bears. The Rice Krispies marketing slogan is “Snap, Crackle and Pop”. There are the three wise monkeys: “See no evil, Hear no evil and Speak no evil”. We’ve all heard the expression “Early to bed and early to rise makes a man healthy, wealthy and wise”. The Olympic motto is “Faster, Higher, Stronger”. When writing a story, we must have a beginning, a middle and an end. Three is an important number in both religion and numerology.
There are a lot more rule-of-three lists out there. See the Wikipedia Rule of Three disambiguation page for an interesting selection.
Refactoring And The Rule of Three
Refactoring is a disciplined way of restructuring existing code by changing its internal structure without changing its external behaviour. It helps us to redesign and rewrite our code to make it better, simpler and easier to maintain.
The “Rule of Three” is a simple code refactoring rule of thumb. It can help us decide when we should refactor similar pieces of code. The rule states that we don’t need to worry about refactoring while we only have two pieces of similar code. Once we have three or more, we should refactor. With only two copies, refactoring may or may not be worth the cost and effort. By the third copy, the cost of refactoring is usually lower than the cost of maintaining the duplicates.
The rule was popularised by Martin Fowler in his book “Refactoring: Improving the Design of Existing Code”. He attributed the rule to Don Roberts, one of the book’s contributors.
A Refactoring Example
Let’s say we have a client who wants us to develop a software system to keep track of all the programmers in their company. The first class we’ll identify is the Programmer class. We’ll start designing the system with that class in mind. However, our client soon changes the scope of the project to include the secretaries. No problem, we can copy and paste the code from the Programmer class to a new Secretary class, make a few small changes and we’re back on track.
It’s obvious that there are a lot of similarities between these two classes. A number of attributes and behaviours are shared. Both the Programmer and the Secretary have first names, surnames, dates of birth, employee numbers, genders, salaries, etc. They both fill out time sheets, drink coffee and apply for leave. But there are behavioural differences: a programmer programs, while a secretary types and does administrative tasks.
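To make the duplication concrete, here is a minimal Python sketch of what the two copied classes might look like at this point. The field names, method names and return strings are assumptions made up for illustration. Only a handful of the shared attributes are shown.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Programmer:
    first_name: str
    surname: str
    date_of_birth: date
    employee_number: int
    salary: float

    def fill_out_time_sheet(self, hours: float) -> str:
        return f"{self.first_name} {self.surname} logged {hours} hours"

    def program(self) -> str:
        return f"{self.first_name} is programming"


@dataclass
class Secretary:
    # Everything down to fill_out_time_sheet() was copied from Programmer.
    first_name: str
    surname: str
    date_of_birth: date
    employee_number: int
    salary: float

    def fill_out_time_sheet(self, hours: float) -> str:
        return f"{self.first_name} {self.surname} logged {hours} hours"

    # The only behaviour that differs from Programmer.
    def type(self) -> str:
        return f"{self.first_name} is typing"
```

Apart from program() and type(), the two classes are identical line for line.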
Should we refactor our code at this stage? We obviously want to avoid code duplication.
Code Duplication
When programming, code duplication leads to many problems. It makes the code harder to maintain. When the logic in a duplicated piece of code changes, it needs to be changed everywhere it’s used. If we miss just one spot, there can be serious consequences.
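As a small illustration of the “missed spot” problem, consider two copies of a leave calculation. The leave policy, the numbers and the function names below are all hypothetical. Imagine the base allowance was raised from 15 to 20 days, but only one copy was updated.

```python
def programmer_leave_days(years_of_service: int) -> int:
    # Updated copy: the (hypothetical) policy now grants 20 base days.
    return 20 + years_of_service // 5


def secretary_leave_days(years_of_service: int) -> int:
    # Forgotten copy: still uses the old base of 15 days.
    return 15 + years_of_service // 5
```

Two employees with the same years of service now get different leave allowances, and nothing in the code flags the inconsistency.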
However, it’s difficult to decide on the most appropriate design if we only have one or two classes (such as the Programmer and the Secretary). Refactoring too early increases the risk of choosing the wrong design abstractions. This could result in even worse code when new requirements are added to the project. This will eventually force us to refactor the code yet again.
Adding Another Class
Imagine that our client extends the scope of the system with an extra requirement. We now need to model and track managers. That should be easy. Do what we did before: copy, paste and change some code, and we have a new Manager class.
But wait, now we have three very similar classes. This potentially adds a lot of maintenance issues. Now would be a good time to revisit our design.
We’d like to extract the common elements from our three classes, and move them into a common base class from which all three can inherit. Thinking about what a programmer, a secretary and a manager actually are, we realise that all three are employees. Thus we can create an Employee class which contains the common elements. This will become the base class (superclass) for the three original classes.
Our class design now looks better. We’ve extracted the common attributes and behaviours, and localised them in one base class. This is much better from the perspective of code maintenance and quality. The only coding we have to do in the Manager, Programmer and Secretary classes is what distinguishes them from each other. These will be the separate manage(), program() and type() methods.
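A refactored design along these lines could be sketched in Python as follows. This is one possible shape, not the only one. The field names, apply_for_leave() and the return strings are illustrative assumptions. Only the class names and the manage(), program() and type() methods come from the description above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Employee:
    # Attributes shared by every kind of employee live here, once.
    first_name: str
    surname: str
    date_of_birth: date
    employee_number: int
    salary: float

    def fill_out_time_sheet(self, hours: float) -> str:
        return f"{self.first_name} {self.surname} logged {hours} hours"

    def apply_for_leave(self, days: int) -> str:
        return f"{self.first_name} {self.surname} requested {days} days of leave"


class Programmer(Employee):
    def program(self) -> str:
        return f"{self.first_name} is programming"


class Secretary(Employee):
    def type(self) -> str:
        return f"{self.first_name} is typing"


class Manager(Employee):
    def manage(self) -> str:
        return f"{self.first_name} is managing"
```

Each subclass inherits the constructor and the shared methods from Employee, so a change to the time sheet logic now happens in exactly one place.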
DRY vs WET
“Don’t Repeat Yourself” (DRY) is a principle of software development aimed at reducing code duplication. We replace any duplicated code that is likely to change with an abstraction that is less likely to change. Class inheritance is one way to do this, since the shared code is written only once in the base class.
The opposite of DRY is WET. This acronym can stand for “we enjoy typing”, “write everything twice” or “waste everyone’s time”. We often find WET solutions in multi-tiered/web applications. A programmer might have duplicated data validation code in a number of places: the HTML page, the business logic, the database queries, etc. A DRY approach tries to reduce or remove the repeated code by using a better design.
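As a rough sketch of the DRY alternative, the validation rule can live in one function that every layer calls. The rule itself (an employee number must be a positive whole number) and all of the function names below are assumptions chosen for illustration.

```python
def validate_employee_number(value: str) -> int:
    """Single place that defines the (assumed) rule: a positive whole number."""
    text = value.strip()
    if not text.isdigit() or int(text) <= 0:
        raise ValueError(f"invalid employee number: {value!r}")
    return int(text)


def handle_form_submission(form: dict) -> int:
    # Hypothetical web layer: reuses the shared rule instead of re-implementing it.
    return validate_employee_number(form["employee_number"])


def import_from_csv_row(row: list) -> int:
    # Hypothetical import layer: same rule, same function.
    return validate_employee_number(row[0])
```

If the rule changes, only validate_employee_number() needs to be edited, and both callers pick up the new behaviour.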
AHA and YAGNI
Another approach to class design is the AHA principle. AHA stands for “Avoid Hasty Abstractions”. This was described by Kent C. Dodds as optimizing for change first, and avoiding premature optimization. This was influenced by Sandi Metz’s idea of “prefer duplication over the wrong abstraction”.
Instead of starting with an abstraction, or abstracting (redesigning) when we reach a specific number of duplications, we can create more flexible, robust software if abstraction is done only when it is needed, and we know how the abstractions need to work.
AHA and YAGNI are very similar. YAGNI stands for “You ain’t gonna need it”. This principle came from Extreme Programming (XP). It says that a programmer shouldn’t add functionality until it’s really necessary. Ron Jeffries, a co-founder of XP, explained it as “Always implement things when you actually need them, never when you just foresee that you will need them.”
Conclusion
Sorry about all the acronyms! Computer science and technology are full of them. Where half of them come from is anyone’s guess…
Hopefully all these principles have given you something to think about. They do have the potential to improve your design skills.
For more on refactoring, see Martin Fowler’s website https://www.refactoring.com. For another perspective on the Rule of Three, see the Coding Horror page.
Was this post useful? Let us know in the comments, and as always, stay safe and keep learning!
1 thought on “Refactoring and The Rule of Three”
Very interesting post.
We were taught not to duplicate at all, but it makes sense that if the code is not going to change (much) then why bother abstracting.
Abstraction does bring about a bit of complexity for maintenance, and of course the extra time required to do it.
I feel guilty “copying-and-pasting” code, so I almost always end up abstracting. Now I won’t feel so guilty.
This was more of a psychological post, thanks.