The Performance of Java's Lists
05/30/2001
The SDK has several implementations of the ordered collection
interface java.util.List. The three best known are
Vector, ArrayList, and LinkedList.
One question that frequently
reaches the Java
Performance Tuning site asks about the performance differences
between these List classes. In this article I'll take a
look at the performance differences between the
LinkedList implementation and the
Vector/ArrayList implementations.
To consider fully the performance ramifications of these classes,
we need to know how they are implemented. So I'll start with brief
descriptions of the most important implementation aspects from the
point of view of performance.
The Vector and ArrayList implementations
Vector and ArrayList are both implemented
with an underlying Object[] array that stores the
elements. Accessing elements by index is a simple matter of accessing
elements in the internal array by index:
public Object get(int index) {
//check the index is valid first .. code not shown here
return elementData[index];
}
The internal array can be bigger than the number of elements held
by the Vector/ArrayList object: the
difference is kept as extra capacity for efficiently adding
elements. Adding elements is then very simply achieved by assigning
the element to the first empty location in the internal array, and
incrementing the index (size) for the new empty location:
public boolean add(Object o) {
ensureCapacity(size + 1); //explained soon
elementData[size++] = o;
return true; //List.add(Object) signature support
}
Inserting elements into the collection at any location other than
the end is slightly more tricky: the array elements above the
insertion point must all be moved up by one, then the assignment can
occur:
public void add(int index, Object element) {
//check the index is valid first .. code not shown here
ensureCapacity(size+1); //explained soon
System.arraycopy(elementData, index, elementData, index + 1,
size - index);
elementData[index] = element;
size++;
}
When the spare capacity is used up, the
Vector/ArrayList object must replace its
internal Object[] array with a new larger array when more
elements need to be added, copying all the elements to the new
array. The new array is 50% to 100% bigger than the old one depending
on the SDK version (the code shown here makes it 100% bigger):
public void ensureCapacity(int minCapacity) {
int oldCapacity = elementData.length;
if (minCapacity > oldCapacity) {
Object oldData[] = elementData;
int newCapacity = Math.max(oldCapacity * 2, minCapacity);
elementData = new Object[newCapacity];
System.arraycopy(oldData, 0, elementData, 0, size);
}
}
The main difference between the Vector class and the
ArrayList class is the use of synchronization. Apart from
two methods used only during serialization, none of the
ArrayList methods are synchronized; in contrast most of
the Vector methods are synchronized directly or
indirectly. Consequently Vector is thread-safe, and
ArrayList isn't. This makes ArrayList
faster than Vector. For some of the latest JVMs the
difference in speed between the two classes is negligible: strictly
speaking, for these JVMs the difference in speed between the two
classes is less than the variation in times obtained from tests
comparing the performance of these classes.
The Vector and ArrayList implementations
have excellent performance for indexed access and update of elements,
since there is no overhead beyond range checking. Adding elements to,
or deleting elements from, the end of the list also gives excellent
performance except when the capacity is exhausted and the internal
array has to be expanded. Inserting and deleting elements always
require an array copy (two copies when the internal array must be
grown first). The number of elements to be copied is proportional to
[size-index], i.e. to the distance between the
insertion/deletion index and the last index in the collection. For
insertions, inserting to the front of the collection (index 0) gives
the worst performance, and inserting at the end of the collection
(after the last element) gives the best. The array copying overhead
grows significantly as the size of the collection increases because
the number of elements that need to be copied with each insertion
increases.