There’s a quote by Phil Karlton (accomplished software nerd, Netscape architect): “There are only two hard things in computer science: cache invalidation and naming things.”
Before we start developing code, we should have a good idea of how to name things. It’s a lot less stressful if we don’t have to worry about naming conventions while we struggle to think of good names for our classes, variables and methods.
Identifier Revision
An identifier is a user-defined name that is given to a class, object, variable, constant, method or label. Identifiers must conform to the following rules:
- Java is case-sensitive. This means that the identifier surname is not the same as the identifier Surname, which is not the same as SURNAME.
- The first character of an identifier must be a letter, and may not be a digit. The meaning of the term “letter” is much broader in Java than in most other languages: it is defined as a to z, A to Z, _, $ or any Unicode character that represents a letter in any language supported by the Unicode standard.
- Subsequent characters can be letters (as defined above) or digits. Digits are 0 to 9 or any Unicode character that represents a digit in any language supported by the Unicode standard.
- Spaces and symbols like + or © cannot be used inside identifiers.
- The identifier cannot be one of the Java reserved words, e.g. if, else, false, for, public, class, extends, etc.
- There is no restriction on the length of an identifier.
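The character rules above can be checked with the standard methods Character.isJavaIdentifierStart and Character.isJavaIdentifierPart. The following is only a minimal sketch: the class name IdentifierCheck and the helper hasValidIdentifierCharacters are made up for illustration, and the helper checks characters only, so it does not reject reserved words such as if.

```java
class IdentifierCheck {

    // Returns true if every character is allowed at its position in a
    // Java identifier. Reserved words are NOT rejected by this check.
    static boolean hasValidIdentifierCharacters(String text) {
        if (text.isEmpty()) {
            return false;
        }
        if (!Character.isJavaIdentifierStart(text.codePointAt(0))) {
            return false;
        }
        // Every character after the first may be a letter or a digit.
        return text.codePoints()
                   .skip(1)
                   .allMatch(codePoint -> Character.isJavaIdentifierPart(codePoint));
    }

    public static void main(String[] args) {
        // Case sensitivity: these are three different variables.
        // (Only the first follows the naming conventions for variables.)
        String surname = "lower case";
        String Surname = "capitalised";
        String SURNAME = "upper case";
        System.out.println(surname + ", " + Surname + ", " + SURNAME);

        System.out.println(hasValidIdentifierCharacters("userName"));         // true
        System.out.println(hasValidIdentifierCharacters("_count"));           // true
        System.out.println(hasValidIdentifierCharacters("$price"));           // true
        System.out.println(hasValidIdentifierCharacters("gr\u00f6\u00dfe"));  // true: Unicode letters
        System.out.println(hasValidIdentifierCharacters("lives9"));           // true: digit after the first character
        System.out.println(hasValidIdentifierCharacters("9lives"));           // false: starts with a digit
        System.out.println(hasValidIdentifierCharacters("first name"));       // false: contains a space
        System.out.println(hasValidIdentifierCharacters("a+b"));              // false: contains a symbol
    }
}
```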
Naming Conventions
Good coding practices and conforming to standard naming conventions are an important part of writing robust and maintainable code.
The following naming conventions are generally used in Java programs:
- Class, interface, enum and annotation names: Every word in the identifier is capitalised, including the first, e.g. MyClass, ArrayIndexOutOfBoundsException, HelloWorld. This is called UpperCamelCase.
- Variable and method names: Every word in the identifier is capitalised, excluding the first. This is called lowerCamelCase. Method names should be verbs or verb phrases that describe their usage, e.g. getValue(), isOpen(). Variable names are typically nouns or noun phrases that describe their purpose, e.g. userName, customerId.
- Package names: These are written in lowercase, normally prefixed with the registered domain name of your organisation backwards, e.g. com.incusdata.examples.
- Constant names: All characters are written in uppercase, and individual words are separated with underscores, e.g. VAT_RATE, MAX_AGE. This is sometimes called CONSTANT_CASE.
Always conform to the standard naming conventions!
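To see the four conventions side by side, here is a small sketch. All of the names in it (NamingConventionsDemo, PriceCalculator, StandardPriceCalculator, calculatePriceWithVat) are invented for this illustration, and the 15% rate is just a placeholder value. The package declaration is shown as a comment so that the snippet can be compiled from any directory.

```java
// package com.incusdata.examples;   // package name: lowercase, reversed domain name

// Class name: UpperCamelCase.
class NamingConventionsDemo {

    // Interface name: UpperCamelCase.
    interface PriceCalculator {
        // Method name: lowerCamelCase verb phrase.
        double calculatePriceWithVat(double netPrice);
    }

    static class StandardPriceCalculator implements PriceCalculator {

        // Constant name: CONSTANT_CASE. The value is illustrative only.
        static final double VAT_RATE = 0.15;

        @Override
        public double calculatePriceWithVat(double netPrice) {
            // Variable name: lowerCamelCase noun phrase.
            double vatAmount = netPrice * VAT_RATE;
            return netPrice + vatAmount;
        }
    }

    public static void main(String[] args) {
        PriceCalculator priceCalculator = new StandardPriceCalculator();
        System.out.println(priceCalculator.calculatePriceWithVat(100.0));
    }
}
```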
Rules of Thumb When Choosing Names
We should aim for self-documenting code. We spend more time reading code than writing it, and unfamiliar naming patterns and abbreviations make unfamiliar code even harder to debug.
Abbreviations and bad names force us to spend time understanding the context first, before we even attempt to debug the code.
Some of the rules of thumb for naming identifiers are:
- Choose readable names that clearly describe the identifier’s purpose.
- Avoid using one-character names.
- Avoid names starting with an underscore character.
- Avoid using magic numbers.
- Never abbreviate names.
- Never prefix identifiers with types; avoid Hungarian notation.
- Add units to identifier names.
- Think twice about putting types in your types, i.e. don’t use the “I” prefix for interfaces and the “Base” prefix for base classes.
- Think twice about coding “util” classes and packages.
Examples
Let’s cover the first four or five rules of thumb.
Look at the following method. Have you any idea what the original programmer intended (other than to obfuscate the code for job security)?
double foo(double bar) {
    return bar * 355.0 * bar / 113.0;
}
How about this method? It’s much clearer!
double getCircleArea(double radius) {
    return Math.PI * radius * radius;
}
It’s the same calculation, using better, self-documenting names and no magic numbers.
Aside: 355.0/113.0 is equal to Math.PI to 6 decimal places, which is way more exact than the oft-repeated 22.0/7.0 from our school days! Just remember the number 113355, split it in the middle and divide the second part by the first part.
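If you want to check the aside for yourself, a quick sketch like the following compares both fractions with Math.PI. The class name PiApproximation and the helper errorOf are made up for this example.

```java
class PiApproximation {

    // Absolute difference between an approximation and Math.PI.
    static double errorOf(double approximation) {
        return Math.abs(approximation - Math.PI);
    }

    public static void main(String[] args) {
        double fromSchoolDays = 22.0 / 7.0;
        double fromMnemonic = 355.0 / 113.0;   // 113355, split and divided

        System.out.println("Math.PI = " + Math.PI);
        System.out.println("22/7    = " + fromSchoolDays + ", error = " + errorOf(fromSchoolDays));
        System.out.println("355/113 = " + fromMnemonic + ", error = " + errorOf(fromMnemonic));
    }
}
```

The error for 355.0/113.0 comes out below 0.000001, while the error for 22.0/7.0 is larger than 0.001.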
What do you think is happening in the following code?
double a[][] = new double[5][12];
double t = 0.0;
// populate a...
for (int i=0; i<5; ++i) {
    for (int j=0; j<12; ++j) {
        t += a[i][j];
    }
}
Lost? Obviously some nested iteration and some addition. Other than that, there is no context and no inherent meaning.
How about now?
double table[][] = new double[5][12];
double total = 0.0;
// populate table appropriately (not shown)
for (int row=0; row<5; ++row) {
    for (int column=0; column<12; ++column) {
        total += table[row][column];
    }
}
Better! But it’s still not ideal. We still have no idea of the context of what is being done. What does table refer to? The identifiers row and column are very general and only imply iteration through a two-dimensional table. And what do the numbers 5 and 12 mean?
And now?
final int YEARS = 5;
final int MONTHS = 12;
double monthlySales[][] = new double[YEARS][MONTHS];
double totalSales = 0.0;
// populate monthlySales table appropriately (not shown)
for (int year=0; year<YEARS; ++year) {
    for (int month=0; month<MONTHS; ++month) {
        totalSales += monthlySales[year][month];
    }
}
Third time lucky! We have a much better idea of the context of the code. The code is self-documenting. We’ve got rid of the two magic numbers and replaced them with constants.
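For anyone who wants to run the final version, here is one possible way to wrap it in a complete class. This is a sketch, not the only way to structure it: the class name SalesReport and the method name calculateTotalSales are invented here, and the sales figures are placeholder values (every month is set to 100.0) so that the result is easy to check by hand.

```java
class SalesReport {

    static final int YEARS = 5;
    static final int MONTHS = 12;

    // Adds up every monthly figure in the table.
    static double calculateTotalSales(double[][] monthlySales) {
        double totalSales = 0.0;
        for (int year = 0; year < monthlySales.length; ++year) {
            for (int month = 0; month < monthlySales[year].length; ++month) {
                totalSales += monthlySales[year][month];
            }
        }
        return totalSales;
    }

    public static void main(String[] args) {
        double[][] monthlySales = new double[YEARS][MONTHS];

        // Placeholder data only: the same made-up figure for every month.
        for (int year = 0; year < YEARS; ++year) {
            for (int month = 0; month < MONTHS; ++month) {
                monthlySales[year][month] = 100.0;
            }
        }

        System.out.println(calculateTotalSales(monthlySales));   // prints 6000.0
    }
}
```

Giving the loop its own method with a verb-phrase name follows the same idea as the renamed variables: the call site now says what is being calculated.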
Conclusion
Next week we’ll look at the remaining rules, especially those dealing with units and types in names.
Do you use any other naming rules? Do you have identifier naming rules in your coding standards? Do you have coding standards? Please share your comments.
Until then, stay safe and write self-documenting code!
PS: There are a few variations of the quote at the beginning of the post. The one I like the most is “There are two hard problems in computer science: cache invalidation, naming things and off-by-1 errors.”