Last week’s post looked at why many computer languages index arrays starting from zero. In it, I also mentioned the concept of an off-by-one error. This is a common error when we use a number either smaller by one or larger by one than the number we should be using. A lot of off-by-one errors are caused by confusion about zero-based array indexing.
Off-by-one Errors in Loops
A common off-by-one error occurs when we’re programming loops, and the loop iterates once too many or too few times . We’ve probably all made this error at least once while we were learning Java (or whatever language we started using). Either we’ve started the loop at 1 instead of 0 or vice-versa, or we’ve used <= N
instead of < N
or vice-versa as the end condition.
The following example is fairly obvious:
// create an array with 10 elements
int arr[] = new int[10];
// 1. correct iteration - loops 10 times
// starting and ending points are correct
for (int i=0; i<arr.length; ++i)
// do something with each array element
// 2. incorrect iteration - loops only 9 times
// starting point is out-by-one
for (int i=1; i<arr.length; ++i)
// do something with each array element
// 3. incorrect iteration - loops 10 times and crashes
// ending point is out-by-one
for (int i=0; i<=arr.length; ++i)
// do something with each array element
// 4. incorrect iteration - loops 9 times and crashes
// both starting and ending points are out-by-one
for (int i=1; i<=arr.length; ++i)
// do something with each array element
Fortunately with Java, we have runtime array subscript bounds checking, so we can’t overwrite any memory that hasn’t been allocated to the array. The JVM will throw an ArrrayIndexOutOfBoundsException
, which makes it very easy to find our error.
Debugging this error is much more difficult in C and C++. Array subscripts are translated to pointer expressions, and we can merrily overwrite any amount of memory whether it’s been allocated to us or not. These can be a nightmare to find and fix!
Array-related confusion can also result from differences in programming languages. We use zero-based arrays in Java, but some languages start array indexing from 1. Pascal has arrays with user-defined indices. This makes it possible to use array indices that model the problem domain better.
Buffers and Strings
As Java programmers, we’re lucky to have a proper String
class. This means we don’t have to worry about allocating the correct amount of memory space for our string objects. However, if we’re programming in C, we have to remember that strings are just arrays of char
s terminated with a NULL
character (\0
). We must remember to always allocate one extra char
to hold the NULL
character. If we forget, we’ll almost certainly get an off-by-one error, and trash some memory. Most of the string functions defined in string.h
look for the terminating NULL
, and if it’s not there, good luck with finding the errors!
Fence-Post Errors
Fence-post errors are related to off-by-one errors. They’re actually a special case of off-by-one errors. They are sometimes called lamp post errors. They refer to how we deal with the initial or boundary conditions of a problem involving whole (discrete) values.
Fence-post errors happen when we count items instead of the spaces between them, or vice versa, or by forgetting to consider whether we should count one or both ends of a row.
Think about the following problem: “If we build a straight fence 10 metres long with posts 1 metre apart, how many posts do we need?” The answer, of course, is 11, not 10. The fence has 10 intervals between posts, and we need an extra post to start (or end) the first interval. Interestingly, either 9 or 11 is a better answer than the obvious 10.
Another example: let’s say we have a large array of items, and we want to process items M
through N
. How many items would we need to process? Our initial answer will probably be N - M
, but this gives an off-by-one error. The right answer is N - M + 1
. A program that used the “obvious” first answer would have a fence-post error.
Not all off-by-one errors are fence-post errors. The game of Musical Chairs creates a series of (semi-)disastrous off-by-one errors where N
people try to sit on N - 1
chairs. This isn’t a fence-post error, but an interesting illustration.
P.S. Any of the following spellings are correct: fence post (two words), fencepost (one word), fence-post (hyphenated).
Ending Off
Wikipedia has an interesting page on off-by-one errors.
There are other good pages on fence-post errors here and here.
Was this interesting? Please share your comments on the blog post, and as always, stay safe and keep learning!