Choosing a Collection in Java

Choose a collection - image of coloured storage crates

We looked at Big O notation in the last post. It’s important to know about Big O notation when we’re working with the Collections API in Java. Let’s look at the Collections API in more detail in this post. In particular, I want to answer the question: How do we choose a concrete collection class to use in our Java code?

Choosing the Interface

Firstly, we need to decide whether we’ll be storing values or key:value pairs. If we want to store values only, then we’ll look at classes implementing the Collection interface. For key:value pairs, we’ll look at classes implementing the Map interface.

Next, we need to decide whether the collection must contain duplicates or not. If we need duplicates in our collection, we need to use a List or a Queue. If we cannot have duplicate elements, a Set or a Map will do.

Choosing the Implementation

Next, we need to think about the intended use of the collection. Do we need to access the elements sequentially or randomly? Is the collection going to stay the same size? Are we going to add and remove elements often? Is it going to grow and shrink? Must the collection be sorted? Do we need a FIFO or a LIFO queue? Answering these questions will help us decide on a suitable concrete implementation.

The following table summarises some of the more useful concrete classes in the Collections framework:

Interface Resizeable Array Linked List Hash Table Binary Tree Hash Table with Linked List
Set HashSet TreeSet LinkedHashSet
List ArrayList

Vector

Stack
LinkedList
Map HashMap
Hashtable
TreeMap LinkedHashMap
Queue PriorityQueue LinkedList

The collection names are easy to understand: the first word(s) in the class name represents the backing storage mechanism (either an array, a linked list, a binary tree or a hash table, or combination of two backing stores), and the last word represents the collection interface type (list, set, map or queue). Detailed explanations follow:

  • If the size of the collection is fairly constant and we need to access the elements by index as if it were an array (random access), an ArrayList is the best choice.

  • If we need to add or remove elements more often than we access them, a LinkedList would be a better choice than an ArrayList. Take note that even though there are methods in the LinkedList class to access elements by index, each access starts from the beginning (head) of the LinkedList and is extremely slow for large lists.

  • If we need our collection to be sorted, we can use a binary tree as the sorting mechanism and choose either a TreeSet or a TreeMap depending on whether we just have values or key:value pairs.

  • If we need very fast element access and insertion/removal from the collection and we are not concerned about the order of the elements, we can use either a HashSet or a HashMap (again depending on whether we need values only or key:value pairs).

  • If we need the speed of a hash table, but don’t want to lose the order that elements have been inserted, we can use either a LinkedHashSet or a LinkedHashMap (a hash table backed by a linked list that stores the order of insertion).

  • If we need a FIFO (first-in, first-out) queue, we can use a LinkedList (which implements the Queue interface), or one of the appropriate classes in the java.util.concurrent package.

  • If we need a LIFO (last-in, first-out) queue, we can use a Stack (which implements the List interface).

The legacy collection classes — Vector, Stack and Hashtable — have been retrofitted to implement the appropriate interfaces, and can be used in new applications. The Vector class is much the same as the ArrayList, and the Hashtable is roughly equivalent to the HashMap class. The Stack class inherits from Vector and adds peek(), push() and pop() methods. A better implementation of a LIFO stack is provided by the Deque interface and its implementations, and should be used in preference to Stack.

These legacy classes are synchronized, which means they are safe for multi-threaded applications, but are slower. If thread-safe collections are not needed, we should rather use the newer unsynchronized classes, which are generally faster than the synchronized legacy classes.

There are many more specialised collections available, such as WeakHashMap, EnumMap, EnumSet and IdentityHashMap to name a few. There are also many specialised Queues, Maps and Lists available in the java.util.concurrent package.

Performance and Big O Notation

The following table gives an indication of how fast various collections are as far as accessing, adding and deleting elements are concerned. The Java API documentation provides more specific details.

The table below uses Big O notation to describe the upper bound/worst-case performance for various operations for the most common collection classes.

Remember that Big O notation is an indication of the complexity of an algorithm. It is described in terms of the number of elements on which the algorithm operates. The O stands for “order of”. It indicates the order of the number of steps needed to complete a particular operation. The number within the parentheses indicates the order where n is the number of elements in the collection. O(1) is fast — i.e. “one” step (i.e., constant performance) no matter how many elements there are in the collection. On the other end of the spectrum is O(n) which is slow — i.e. “n” steps (i.e., linear performance), so the larger the collection is, the slower the operation becomes.

Type Accessing elements Adding/deleting elements Description
Resizeable array O(1) O(n) Very fast on access, slow when adding or removing elements.
Linked list O(n) O(1) Very fast when adding or removing elements, slow on access.
Hash table O(1) O(1) Very fast, but loses ordering of element insertion (chaotic).
Binary tree O(log n) O(log n) Slower than hash table, but sorted.
Queue O(1) O(log n) Generally FIFO (first-in-first-out) queue.
Stack O(n) for pop() O(n) for push() LIFO (last-in-first-out) queue.

For more detailed coverage on Big O notation, there are a number of cheat sheets on the web. Good ones are here, here and here.

Stay safe, and I’ll see you next week!

Leave a Comment

Your email address will not be published. Required fields are marked *

Code like a Java Guru!

Thank You

We're Excited!

Thank you for completing the form. We're excited that you have chosen to contact us about training. We will process the information as soon as we can, and we will do our best to contact you within 1 working day. (Please note that our offices are closed over weekends and public holidays.)

Don't Worry

Our privacy policy ensures your data is safe: Incus Data does not sell or otherwise distribute email addresses. We will not divulge your personal information to anyone unless specifically authorised by you.

If you need any further information, please contact us on tel: (27) 12-666-2020 or email info@incusdata.com

How can we help you?

Let us contact you about your training requirements. Just fill in a few details, and we’ll get right back to you.

Your Java tip is on its way!

Check that incusdata.com is an approved sender, so that your Java tips don’t land up in the spam folder.

Our privacy policy means your data is safe. You can unsubscribe from these tips at any time.