In the previous post on decompiling, we saw how easy it is to recover source code from a compiled Java class file. There’s a lot of hard work and intellectual property embedded in those compiled files. What happens if someone gets access to your class files and decompiles everything? Your intellectual property is there for the taking!
Can we protect our class files? Fortunately we can. With the help of an obfuscator, we can make the decompiled files much more difficult to understand and modify. This deters reverse engineering and makes it harder to extract intellectual property or bypass licensing restrictions.
What Does An Obfuscator Do?
An obfuscator can do any or all of the following:
- Change the names of classes, fields and methods to seemingly random names. We know that good names help document code by making it easier to understand; meaningless names do the opposite.
- Use heavy overloading of method names. Imagine trying to understand a decompiled class with a dozen foo() methods with different sets of parameters, some of which are dummy (unused) parameters.
- Replace identifiers with Java keywords. This makes it extremely difficult to understand decompiled Java classes, and the decompiled source will not even recompile. Obfuscators can do this because they work on the compiled classes, while Java keywords are only reserved in source code read by the compiler.
- Strip all unnecessary information like line numbers, local variable names and source file names from the class files.
- Replace statements, loops, literal values and strings with more complicated roundabout expressions to deliberately hide what the code is doing.
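To make the renaming and overloading techniques above concrete, here is a hand-written sketch of a small class before and after obfuscation. The class InvoiceCalculator and its methods are hypothetical, and the "after" version is an imitation written by hand, not the output of any particular obfuscator.

```java
// Original, readable version (hypothetical example class).
class InvoiceCalculator {
    static double netTotal(double unitPrice, int quantity) {
        return unitPrice * quantity;
    }

    static double grossTotal(double netTotal, double taxRate) {
        return netTotal * (1.0 + taxRate);
    }
}

// Hand-written imitation of what a decompiler might show after obfuscation:
// meaningless names, with both methods overloaded under the same name "a".
class a {
    static double a(double a, int b) {
        return a * b;
    }

    static double a(double a, double b) {
        return a * (1.0 + b);
    }
}

class RenamingDemo {
    public static void main(String[] args) {
        double readable = InvoiceCalculator.grossTotal(InvoiceCalculator.netTotal(10.0, 3), 0.5);
        double obscure = a.a(a.a(10.0, 3), 0.5);
        // Same behaviour, very different readability.
        System.out.println(readable == obscure); // prints true
    }
}
```

Both versions compute the same result. The only thing lost is the documentation that good names provide, which is exactly what the obfuscator is trying to remove.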
A beneficial side effect of obfuscation is that the class size is often reduced, which leads to faster downloads and reduces application startup time.
How Does An Obfuscator Work?
To determine which classes can be obfuscated, most obfuscators start at a single entry point (usually the main() method of an application), and construct a tree of all classes accessible from that point. Unfortunately, this only works for simple applications, and breaks down if the Java code has multiple entry points, such as library code.
Most obfuscators work on all the class files packaged in a JAR file. They can reduce the size of these files if the application does not use all of the contained bytecode. The obfuscator analyses the bytecode of the input JAR files to determine which code cannot be reached. This unused code (either entire classes or single methods and fields) will then be removed. This is common when using third-party libraries because usually we do not use all of the library functionality. A large proportion of the bytecode and/or resources can then be safely removed.
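The reachability idea described above can be sketched in a few lines. This is a simplified illustration, not how any specific obfuscator is implemented: the reference graph is written by hand as a map of method names (all hypothetical), whereas a real tool would build it by reading the bytecode. It assumes Java 9 or later for List.of and Map.of.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilitySketch {
    // Returns every member reachable from the entry point by following references.
    static Set<String> reachable(Map<String, List<String>> references, String entryPoint) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.push(entryPoint);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (seen.add(current)) { // true only the first time we visit a member
                for (String target : references.getOrDefault(current, List.of())) {
                    pending.push(target);
                }
            }
        }
        return seen;
    }

    // Everything that is declared but never reached is a candidate for removal.
    static Set<String> removable(Map<String, List<String>> references, String entryPoint) {
        Set<String> unused = new TreeSet<>(references.keySet());
        unused.removeAll(reachable(references, entryPoint));
        return unused;
    }

    public static void main(String[] args) {
        // Hypothetical application that uses only part of a JSON library.
        Map<String, List<String>> references = Map.of(
            "App.main", List.of("Report.print", "Json.write"),
            "Report.print", List.of("Json.write"),
            "Json.write", List.of(),
            "Json.read", List.of("Json.parseNumber"),
            "Json.parseNumber", List.of());
        System.out.println(removable(references, "App.main")); // prints [Json.parseNumber, Json.read]
    }
}
```

The sketch also shows why library code is harder to handle: with no single entry point, every public method would have to be treated as a starting point, and very little would be removable. Code reached only through reflection is invisible to this kind of analysis, which is one reason a badly configured obfuscator can cause runtime errors.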
Which Obfuscator Should I Use?
There are a variety of open-source and commercial obfuscators available. A list of commonly used ones follows:
- ProGuard is a popular open-source GPL-licenced bytecode optimizer and file shrinker for Java and Kotlin. It claims to make these applications up to 90% smaller and 20% faster. It also provides some minimal obfuscation by renaming classes, fields and methods. Android builds used ProGuard by default for many years; since Android Gradle Plugin 3.4 the default shrinker is R8, which reads the same ProGuard rule files.
- yGuard is another commonly used open-source obfuscator.
- Zelix KlassMaster is a full featured commercial Java obfuscator. It shrinks and obfuscates both code and string constants.
- Allatori is a commercial second generation Java obfuscator.
- DashO Java and Android Obfuscator is a commercial second generation Java obfuscator.
- Stringer Java Obfuscation Toolkit is a commercial Java obfuscator supporting up to Java 13.
There is an easy-to-read introductory article with extra links on bytecode obfuscation on the OWASP Foundation’s website. Another good introductory article on obfuscation techniques is on the DashO website.
Should I Use An Obfuscator?
With enough time and effort, almost all code can be reverse engineered. Obfuscators can make reverse-engineering more difficult and economically impractical.
Developers and managers often exaggerate the risk of someone seeing the source code. While good decompilers can produce good source code, it’s not trivial to analyse it. The costs and legal risks associated with reverse engineering are high enough to discourage decompiling for commercial gain. Even with full source code, it’s generally hard enough to understand what the intention of the coder was. Obfuscation is just the icing on top of the cake.
Obfuscation does protect your code from casual attackers. However, its main benefit is minimizing the size of the application by removing unused code and shortening all identifiers to 1 or 2 characters.
The current thinking is to protect sensitive String data rather than proprietary algorithms. These Strings can contain logins, passwords, licensing codes, API credentials, links to non-public resources, text for GUI elements, error messages, etc. Strings make up a large part of most applications.
The downside of String encryption is that it affects application performance, because the Strings must be decrypted at runtime before they can be used. Performance could degrade, possibly on the order of 10%, depending on the application.
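To see where the runtime cost comes from, here is a minimal sketch of the general idea: a String is scrambled at build time and unscrambled every time it is needed. It uses a simple XOR with a fixed key plus Base64, which is an assumption made for illustration only. It is not real encryption and not the scheme used by any of the commercial tools mentioned in this post, and the key and the sample value are made up.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class StringHidingSketch {
    // Hypothetical key. It ships inside the class file, so this only slows an attacker down.
    private static final byte[] KEY = "not-a-real-key".getBytes(StandardCharsets.UTF_8);

    // XOR is symmetric: applying it twice with the same key restores the input.
    private static byte[] xor(byte[] data) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ KEY[i % KEY.length]);
        }
        return out;
    }

    // Build-time step: turn a readable literal into an opaque one.
    static String hide(String plain) {
        return Base64.getEncoder().encodeToString(xor(plain.getBytes(StandardCharsets.UTF_8)));
    }

    // Run-time step: this extra work happens on every use of the String.
    static String reveal(String hidden) {
        return new String(xor(Base64.getDecoder().decode(hidden)), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String hidden = hide("jdbc:mysql://db.example.invalid/app");
        System.out.println(hidden);         // what someone browsing the class file would see
        System.out.println(reveal(hidden)); // what the application actually uses
    }
}
```

An obfuscator would replace each String literal in the class file with its hidden form and a call to something like reveal(). That call is the performance cost, and because the key has to be present in the application, a determined attacker can still recover the original text.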
The future of obfuscation will probably be in String encryption tools. Currently String encryption is available only in commercial obfuscators, such as Allatori, Zelix KlassMaster, Stringer Java Obfuscation Toolkit and DashO.
Conclusion
It really boils down to how your Java code is distributed and used, and who your clients are. Obfuscation adds a layer of complexity to your build process. If it isn’t set up correctly, it can cause runtime errors. If someone has access to the JAR files on your internal servers and has the knowledge to be able to sniff around inside these servers, then there are far more dangerous things they can do than steal your source code. Your entire operation is then in jeopardy.
The conclusion: Be aware of the occasional need for obfuscation. String encryption is probably the best option. Using and publishing open source code reduces the need to hide any intellectual property.
I’m always interested in your opinion, so please leave a comment. Your feedback helps me write tips that help you.