You may have heard some programmers say that code should be difficult to read because it is difficult to write.
That’s obviously a bad idea from the point of view of maintenance and modification. However, that should make us think about software complexity, and how to measure and reduce it.
What is Cyclomatic Complexity?
Cyclomatic Complexity (CC) is one of the metrics we can use to measure code complexity. It was introduced by Thomas McCabe in 1976, and indicates the complexity of a method with a single number. The original paper used the term “module”, but we can think in terms of a function or method in most languages.
McCabe’s metric counts the number of linearly independent execution paths between the entry point of a function and its exit. The formula is based on graph theory, and is computed from the control flow graph of a method. If you’re interested in the maths, the original paper can be found here.
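For readers who want the formula itself, it is usually written as below. The symbol names follow common usage rather than this article; McCabe's paper writes the same thing as v(G) = e - n + 2p.

```latex
M = E - N + 2P
% E = number of edges in the control flow graph
% N = number of nodes in the control flow graph
% P = number of connected components (P = 1 for a single method, so M = E - N + 2)
```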
CC Value Calculation
The CC value is calculated by measuring the number of independent execution paths of a method. It’s an indication of how many conditional branches and loops there are in the method. For structured programming (with no goto statements) the CC value is roughly equal to the number of loops and conditional statements plus one.
For example, the CC value of the following method is 3 (one loop, one condition, plus one).
public boolean containsDigit(String s) {
    for (int i = 0; i < s.length(); ++i) {
        if (Character.isDigit(s.charAt(i))) {
            return true;
        }
    }
    return false;
}
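The "plus one" count can be cross-checked against the graph definition (edges minus nodes plus two, for a single connected method). The sketch below is only an illustration: the class name CfgComplexity, the helper complexity, and the node numbering are assumptions made for this example, and a different but equivalent way of drawing the control flow graph would give other edge and node counts with the same result.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not taken from any analysis tool.
class CfgComplexity {

    // Computes E - N + 2P for a graph given as directed edges {from, to}.
    static int complexity(int[][] edges, int connectedComponents) {
        Set<Integer> nodes = new HashSet<>();
        for (int[] edge : edges) {
            nodes.add(edge[0]);
            nodes.add(edge[1]);
        }
        return edges.length - nodes.size() + 2 * connectedComponents;
    }

    // One possible control flow graph for containsDigit:
    // 0 = int i = 0        1 = i < s.length()   2 = isDigit(...)
    // 3 = return true      4 = ++i              5 = return false
    // 6 = method exit
    static final int[][] CONTAINS_DIGIT_EDGES = {
        {0, 1}, // initialisation -> loop condition
        {1, 2}, // loop condition true -> if condition
        {1, 5}, // loop condition false -> return false
        {2, 3}, // if condition true -> return true
        {2, 4}, // if condition false -> increment
        {4, 1}, // increment -> loop condition
        {3, 6}, // return true -> exit
        {5, 6}  // return false -> exit
    };

    public static void main(String[] args) {
        // 8 edges - 7 nodes + 2 * 1 component = 3
        System.out.println(complexity(CONTAINS_DIGIT_EDGES, 1));
    }
}
```

With this graph the result matches the shortcut: one loop, one condition, plus one.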
The CC value is independent of the number of source code lines in the method. It doesn’t matter how many lines there are, or the code style we use. The CC value measures only the complexity of the control flow. It doesn’t measure the complexity of any of the data structures, and it can’t capture a complex object-oriented design where the code is split over several classes.
We will generally use a static code analysis tool to calculate cyclomatic complexity. Unfortunately McCabe’s original paper was vague on some of the calculation details. This means that different tools might give slightly different complexity values for the same piece of code.
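One place where counting conventions can diverge is compound boolean conditions. The method below is a made-up example (isUsable is an illustrative name, not something from a particular tool or codebase) with a single if statement that joins two conditions with &&.

```java
// Hypothetical example for illustration only.
class CountingConventions {

    static boolean isUsable(String s) {
        // One if statement, but two conditions joined by &&.
        // Counting branching statements only: 1 + 1 = 2.
        // Counting each && or || as an extra decision point: 2 + 1 = 3.
        if (s != null && !s.isEmpty()) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isUsable(null));
        System.out.println(isUsable(""));
        System.out.println(isUsable("abc"));
    }
}
```

A tool that counts only statements would likely report 2 here, while one that treats each short-circuit operator as its own decision would likely report 3. Neither number is wrong; they reflect different readings of what counts as a decision.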
CC Values vs Testing
The complexity level affects the testability of the code. The higher the CC value, the more difficult it will be to write relevant tests.
Thresholds vary between static analysis tools, but a typical scale looks like this:
- 1 to 4: Low complexity. Easy to test.
- 5 to 7: Moderate complexity. Acceptable, but not ideal.
- 8 to 10: High complexity. Should be refactored to make testing easier.
- 11 and over: Very high complexity. Very difficult to test and maintain. Redesign and/or rewrite.
The cyclomatic complexity gives an upper bound for the number of test cases needed to get full branch coverage of the code, so it can be used to estimate the effort required for writing tests. It does not give the exact number: a CC value of N means that at most N test cases are needed to achieve a 100% branch coverage score, and well-chosen inputs can often cover every branch with fewer.
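The containsDigit method from earlier makes the "upper bound" point concrete. Its CC value is 3, yet the two inputs in this sketch already exercise both outcomes of the loop condition and both outcomes of the if condition. The class name ContainsDigitCoverage and the check helper are assumptions made for this example, not part of any test framework.

```java
// Illustrative sketch; in a real project these would be unit tests.
class ContainsDigitCoverage {

    static boolean containsDigit(String s) {
        for (int i = 0; i < s.length(); ++i) {
            if (Character.isDigit(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Minimal stand-in for a test framework assertion.
    static void check(boolean actual, boolean expected) {
        if (actual != expected) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        // "a1": loop condition true, if condition false for 'a', then true for '1'.
        check(containsDigit("a1"), true);
        // "a": if condition false, then loop condition false, so the final return runs.
        check(containsDigit("a"), false);
        System.out.println("both checks passed");
    }
}
```

A third case such as the empty string is still worth adding, since it covers the path where the loop body never runs, but branch coverage alone does not require it.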
McCabe’s paper suggested that programmers limit the cyclomatic complexity of their modules to a maximum value of 10. This was recommended as “a reasonable, but not magical, upper limit”. A function with a CC value larger than 10 is difficult to maintain because it has too many branches, switch/cases or loops. If the complexity exceeds 10, we should probably refactor and/or redesign our code.
High complexity generally translates to low readability and high maintenance costs. However, complexity of the control flow graph is not necessarily what a person would perceive as complex. Large switch/case statements can easily exceed the CC limit of 10. But switch/case statements are generally very easy to read and understand.
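As a rough illustration of the switch/case point, consider the made-up method below (dayName is a hypothetical example). Assuming a tool that counts each case label as a decision, it would score about 8 (seven cases plus one), and the same pattern applied to the twelve months would exceed 10, even though the code reads like a simple lookup table.

```java
// Hypothetical example for illustration only.
class DayNames {

    static String dayName(int day) {
        switch (day) {
            case 1: return "Monday";
            case 2: return "Tuesday";
            case 3: return "Wednesday";
            case 4: return "Thursday";
            case 5: return "Friday";
            case 6: return "Saturday";
            case 7: return "Sunday";
            default: return "Unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(dayName(3));
    }
}
```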
Conclusion
Unfortunately for us programmers, no single measurement can express an abstract concept such as complexity as one number.
This doesn’t mean that we should not measure and control complexity. It has to be done using multiple metrics that cover different aspects of complexity.
For example, measuring the length of a method is a very simple metric. Many developers might reject this as being too simple, but longer methods are generally harder to understand and maintain. So the length of a method can be a useful complexity measurement.
Another metric we can look at is nesting depth. This is the number of control structures that are nested inside each other. Deeply nested statements increase complexity by making it harder for a developer to understand the code.
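Nesting depth and cyclomatic complexity can disagree, which is one reason to track both. In the sketch below (the class, method, and parameter names are invented for illustration) both methods contain three if statements, so the "plus one" rule gives each a CC value of 4, but the first nests three levels deep while the second stays at one level by using early returns.

```java
// Hypothetical example for illustration only.
class NestingDepth {

    // Nesting depth 3: each condition sits inside the previous one.
    static boolean canShipNested(boolean paid, boolean inStock, boolean addressValid) {
        if (paid) {
            if (inStock) {
                if (addressValid) {
                    return true;
                }
            }
        }
        return false;
    }

    // Nesting depth 1: the same decisions written as guard clauses.
    static boolean canShipFlat(boolean paid, boolean inStock, boolean addressValid) {
        if (!paid) {
            return false;
        }
        if (!inStock) {
            return false;
        }
        if (!addressValid) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canShipNested(true, true, false));
        System.out.println(canShipFlat(true, true, false));
    }
}
```

Both versions return the same result for every combination of inputs; only the shape of the code differs.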
McCabe’s cyclomatic complexity is a simple metric that allows us to get reliable results at the function/method level. It allows us to easily identify methods with a lot of loops, conditional statements, and switch/case statements. It doesn’t identify other types of software complexity, such as design pattern implementations spread across a number of classes.
Have you ever used cyclomatic complexity metrics? What other metrics have you used? Let us know in the comments, and as always, stay safe and keep learning!