You may have heard of static code analysis, and even have it in your build files. But what is it, and why is it important?
What Is Static Code Analysis?
You already know about dynamic testing, where a tester runs your code and checks the outputs it produces for predefined inputs, or where a battery of automated unit and system tests is run against it. Both kinds of testing report on the correctness (or otherwise) of your code.
Static code analysers take a different approach. The name explains it all: these are tools that analyse your code without running it, i.e. statically. You can think of them as a peer reviewer on steroids: one who is extremely knowledgeable about the language and APIs, brilliant at identifying small problems, never misses a thing and stays focused during the entire review, and who does it over and over, consistently, without getting irritated, tired or frustrated.
Both static and dynamic testing should be done to maximise the quality and correctness of the code, and reduce the number of problems a user will experience when using it.
Unfortunately, many projects still don’t use static analysis tools, for a variety of reasons: lack of knowledge about them, too many errors being flagged, a slower build process, and so on. These are generally poor excuses: improving code quality, reducing technical debt and delivering an error-free product are at the heart of software development.
What Does Static Code Analysis Do?
Depending on which static code analyser you use, it can do many or all of the following:
- Flag programming and style errors.
- Find bugs and suspicious constructs.
- Find unused and uninitialised variables.
- Suggest better API usage, including suggesting faster methods.
- Enforce coding standards.
- Find buffer overflows.
- Find casting and conversion problems.
- Flag ignored return values.
- Detect potential memory leaks.
- Detect command injection vulnerabilities.
- Detect copy-and-paste problems.
- Detect concurrency issues.
- Detect unused parameters/local variables.
And the list goes on and on, with hundreds more.
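To make a couple of these categories concrete, here is a minimal sketch of code that compiles and runs yet contains an ignored return value and an unused local variable. The class and method names are hypothetical and chosen only for illustration; which findings are actually reported depends on the analyser and the rules you enable.

```java
class Smells {
    // Ignored return value: String is immutable, so trim() returns a new
    // string and leaves 'input' untouched. The result is thrown away.
    static String sloppyTrim(String input) {
        input.trim();
        return input;
    }

    // Corrected version: use the value that trim() returns.
    static String carefulTrim(String input) {
        return input.trim();
    }

    public static void main(String[] args) {
        int unused = 42; // unused local variable: assigned but never read
        System.out.println("[" + sloppyTrim("  hello  ") + "]");  // prints [  hello  ]
        System.out.println("[" + carefulTrim("  hello  ") + "]"); // prints [hello]
    }
}
```

Neither problem stops the program from compiling, which is exactly why this kind of mistake tends to survive until a tool or a reviewer points it out.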
One of the original static code analysers was lint, a Unix utility that examined C language source code. The word lint comes from the tiny bits of fibre and fluff that come off clothes as they are tumbled around in a tumble dryer. In the same way, the lint tool finds the fluff in your code. Small coding errors can often have large effects, so by getting rid of them you can improve the quality of your code.
The Java compiler javac has a lint-like capability. When compiling Java code, you can pass the -Xlint flag to the compiler. It then flags lots of smaller errors that would otherwise escape your notice.
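As a small illustration of the kind of problem -Xlint reports, the sketch below contains a switch case that falls through into the next one. The class name FallThrough is hypothetical. Compiling it with javac -Xlint (or, more narrowly, javac -Xlint:fallthrough) should produce a fall-through warning, while a plain javac run compiles it silently.

```java
class FallThrough {
    // Returns a label for the code. Case 1 is missing its break statement.
    static String describe(int code) {
        String result = "unknown";
        switch (code) {
            case 1:
                result = "one";
                // no break here, so execution continues into case 2
            case 2:
                result = "two";
                break;
            default:
                break;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describe(1)); // prints two, not one
    }
}
```

The bug is a single missing keyword, and the program still runs without any error, so a compiler warning is often the cheapest way to catch it.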
Most static code analysers read the source code, while some analyse the compiled object code (.class files in the case of Java).
Which Static Code Analysis Tool Should I Use?
There are hundreds of great tools to choose from, many being free and/or open-source.
The absolute best site to start your search for an appropriate static code analyser is https://analysis-tools.dev/. At the time of writing, it lists and compares 541 tools for 67 programming languages and 51 markup languages. These include lint tools (“linters”), code formatters, style checkers, security checkers, metadata checkers, etc.
Static Code Analysis: An Example
Here’s a simple class which populates and iterates through six different concrete Collection classes, timing each run.
    import java.util.*;

    public class CollectionTester {
        public static void main(String[] args) throws Exception {
            System.out.println("====== Collections ======");
            final String fmt = "%-25s took %4d milliseconds to %s.%n";
            final int MAX = 1_000_000;
            final String[] cols = {
                "java.util.Vector", "java.util.ArrayList", "java.util.LinkedList",
                "java.util.TreeSet", "java.util.HashSet", "java.util.LinkedHashSet"
            };
            long start, end;
            Integer integer = null;
            Random randomGenerator = new Random();

            // forcing garbage collection to start with a clear slate
            System.gc(); System.gc();

            for(int i = 0; i < cols.length; i++) {
                Class cl = Class.forName(cols[i]);
                Collection<Integer> c = (Collection<Integer>)cl.newInstance();

                // time how long it takes to populate the collection
                start = System.currentTimeMillis();
                for(int elem = 0; elem < MAX; elem++) {
                    c.add(new Integer(randomGenerator.nextInt()));
                }
                end = System.currentTimeMillis();
                System.out.printf(fmt, cols[i], end-start, "populate");

                // time how long it takes to iterate through the collection
                Iterator<Integer> it = c.iterator();
                start = System.currentTimeMillis();
                while (it.hasNext()) {
                    integer = it.next();
                    integer++;
                }
                end = System.currentTimeMillis();
                System.out.printf(fmt+"%n", cols[i], end-start, "iterate");
                System.gc();
            }
        }
    }
Here’s the output of the javac compiler when the source code is compiled with the -Xlint flag:
    CollectionTester.java:21: warning: [rawtypes] found raw type: Class
                Class cl = Class.forName(cols[i]);
                ^
      missing type arguments for generic class Class<T>
      where T is a type-variable:
        T extends Object declared in class Class
    CollectionTester.java:23: warning: [deprecation] newInstance() in Class has been deprecated
                Collection<Integer> c = (Collection<Integer>)cl.newInstance();
                                                               ^
      where T is a type-variable:
        T extends Object declared in class Class
    CollectionTester.java:27: warning: [deprecation] Integer(int) in Integer has been deprecated
                    c.add(new Integer(randomGenerator.nextInt()));
                          ^
    3 warnings
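One possible way to address the three warnings above is sketched below. This is an illustration rather than the only fix: the class name LintFixes and the helper newCollection are hypothetical, the raw Class becomes Class<?>, the deprecated Class.newInstance() is replaced with getDeclaredConstructor().newInstance(), and new Integer(int) gives way to autoboxing. The cast to Collection<Integer> cannot be verified at runtime, so the sketch suppresses that one warning explicitly.

```java
import java.util.Collection;
import java.util.Random;

class LintFixes {
    // Creates an empty collection from a class name without raw types
    // or deprecated reflection calls.
    @SuppressWarnings("unchecked") // generic casts cannot be checked at runtime
    static Collection<Integer> newCollection(String className) {
        try {
            Class<?> cl = Class.forName(className);
            return (Collection<Integer>) cl.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Random randomGenerator = new Random();
        Collection<Integer> c = newCollection("java.util.ArrayList");
        for (int elem = 0; elem < 5; elem++) {
            c.add(randomGenerator.nextInt()); // autoboxing replaces new Integer(int)
        }
        System.out.println(c.size()); // prints 5
    }
}
```

Wrapping the reflection in a helper keeps the suppression scoped to the single line that needs it, instead of silencing warnings for the whole main method.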
PMD is a free, open-source static code analyser. It is used as a component in a number of other open-source analysers. It can be run in standalone mode from the command line as follows:
D:\pmd-bin-6.31.0\bin>pmd -dir "d:\dev\collections" -format text -R rulesets/java/quickstart.xml -version 11 -language java
The output from PMD follows:
    D:\dev\collections\CollectionTester.java:3: NoPackage: All classes, interfaces, enums and annotations must belong to a named package
    D:\dev\collections\CollectionTester.java:3: UseUtilityClass: All methods are static. Consider using a utility class instead. Alternatively, you could add a private constructor or make the class abstract to silence this warning.
    D:\dev\collections\CollectionTester.java:10: LocalVariableNamingConventions: The final local variable name 'MAX' doesn't match '[a-z][a-zA-Z0-9]*'
    D:\dev\collections\CollectionTester.java:20: OneDeclarationPerLine: Use one line for each declaration, it enhances code readability.
    D:\dev\collections\CollectionTester.java:21: UnusedLocalVariable: Avoid unused local variables such as 'integer'.
    D:\dev\collections\CollectionTester.java:25: DoNotCallGarbageCollectionExplicitly: Do not explicitly trigger a garbage collection.
    D:\dev\collections\CollectionTester.java:25: DoNotCallGarbageCollectionExplicitly: Do not explicitly trigger a garbage collection.
    D:\dev\collections\CollectionTester.java:27: ForLoopCanBeForeach: This for loop can be replaced by a foreach loop
    D:\dev\collections\CollectionTester.java:48: DoNotCallGarbageCollectionExplicitly: Do not explicitly trigger a garbage collection.
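Several of these findings are straightforward to act on. The sketch below shows one way the populate timing could be restructured with them in mind; the class name CollectionTimer and the method timePopulate are hypothetical, and the package declaration that NoPackage asks for is left out only to keep the example a single self-contained file. Whether PMD reports anything else for this code depends on the ruleset in use.

```java
import java.util.ArrayList;
import java.util.Collection;

// A final class with a private constructor, as the UseUtilityClass message suggests.
final class CollectionTimer {
    private CollectionTimer() {
        // utility class, not meant to be instantiated
    }

    // Adds 'count' integers to the target and returns the elapsed milliseconds.
    static long timePopulate(Collection<Integer> target, int count) {
        long start = System.nanoTime(); // one declaration per line
        for (int elem = 0; elem < count; elem++) {
            target.add(elem);
        }
        long end = System.nanoTime();
        return (end - start) / 1_000_000;
    }

    public static void main(String[] args) {
        final int maxElements = 1_000_000; // camelCase matches '[a-z][a-zA-Z0-9]*'
        Collection<Integer> list = new ArrayList<>();
        long millis = timePopulate(list, maxElements);
        // No explicit System.gc() calls, and no unused locals.
        System.out.printf("%-25s took %4d milliseconds to %s.%n",
                "java.util.ArrayList", millis, "populate");
    }
}
```

Passing the collection in as a parameter also removes the need for reflection in this part of the code, which makes the method easy to call from a unit test.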
It’s fairly clear from the outputs that using both the -Xlint compiler flag and a static code analyser will give you greater insight into your code. This will allow you to fix potential problem areas.
Try it yourself. Download a static code analyser and point it at your current project. You’ll be both surprised and horrified by the number of coding issues it will find.
Conclusion
You should definitely consider using a static code analyser (or a number of them) in your production build process. Correcting the issues it finds may create a lot of extra work in the short term, but it will yield substantial long-term benefits in code quality, stability and speed.
I’m always interested in your opinion, so please leave a comment. Your feedback helps me write tips that help you.