The Performance of Java's Lists
The LinkedList implementation
LinkedList is implemented using a list of
doubly-linked nodes. To access elements by index you need to traverse
all the nodes until the indexed node is reached:
public Object get(int index) {
//check the index is valid first .. code not shown here
Entry e = header; //starting node
//go forwards or backwards depending on which is closer
if (index < size/2) {
for (int i = 0; i <= index; i++)
e = e.next;
} else {
for (int i = size; i > index; i--)
e = e.previous;
}
return e;
}
Inserting elements into the list is straightforward: traverse to
the node at the index and insert a node immediately before that
node:
public void add(int index, Object element) {
//check the index is valid first .. code not shown here
Entry e = header; //starting node
//go forwards or backwards depending on which is closer
if (index < size/2) {
for (int i = 0; i <= index; i++)
e = e.next;
} else {
for (int i = size; i > index; i--)
e = e.previous;
}
Entry newEntry = new Entry(element, e, e.previous);
newEntry.previous.next = newEntry;
newEntry.next.previous = newEntry;
size++;
}
The LinkedList implementation has a performance
overhead for indexed access and update of elements, since access to
any index requires you to traverse multiple nodes. Adding elements to
the collection suffers from the index traversal access performance
drawback and, in addition, has a further overhead in requiring the
creation of a node object. On the plus side, there is no further
overhead to insertions and deletions, so insertions-deletion overhead
is really mostly dependent on how far away the insertion-deletion
index is from the ends of the collection.
Performance tests
There are many different functions of the classes that could be
tested. LinkedLists are frequently used because of their
supposed better performance for random index insertion and deletion,
so I decided to focus on insertion performance, i.e. building
collections. I've tested LinkedList against
ArrayList since both are unsynchronized (see the
"Thread-safe LinkedLists and other collections"
sidebar).
The insertion speed is critically dependent on the size of the
collection and the position where the element is to be inserted. All
the best and worst performance cases arise when inserting either at
one of the ends or at the exact middle point of the
collection. Consequently, I've chosen three insertion locations
(start, middle, end of the collection), and three representative
collection sizes of medium (100 elements), large (10,000 elements),
and very large (1,000,000 elements).
For my tests I used the Sun JVMs from the Java SDK 1.2.0 and 1.3.0
release. In addition I also tested with HotSpot JVM 2.0, the version
available with the 1.3.0 SDK. All measured times in each table have
been normalized to one of the SDK 1.2 VM tests (the cell showing 100%
in each table). The default JVM configuraton with JIT enabled was
used, though heap space needed to be increased for all JVMs to avoid
out-of-memory errors. Times recorded were the averages over several
test runs. In order to avoid garbage collection interference, I forced
a full scavenge of all memory between each test (see the source code
for details). The disk was monitored to ensure that disk paging did
not occur (any test runs which showed significant paging were
discarded). Each test that was too slow to give a multiple second
response time was looped repeatedly until a significantly long enough
time was recorded.
Table 1: Building a medium sized collection (100 elements). Figures in brackets are for pre-sized collections |
|
1.2 JVM |
1.3 JVM |
HotSpot 2.0 JVM |
Always inserting at the start of the ArrayList |
100% (48.0%) |
184.9% (152.0%) |
108.0% (66.7%) |
Always inserting at the start of the LinkedList |
135.5% |
109.1% |
85.3% |
Always inserting at the mid-point of the ArrayList |
130.0% (40.6%) |
187.4% (158.0%) |
84.7% (46.0%) |
Always inserting at the mid-point of the LinkedList |
174.0% |
135.0% |
102.3% |
Always inserting at the end of the ArrayList |
63.3% (20.7%) |
65.9% (25.0%) |
60.3% (29.3%) |
Always inserting at the end of the LinkedList |
106.7% |
86.3% |
80.3% |
For short collections, ArrayList and
LinkedList are pretty close in
performance. ArrayLists have the edge when inserting to
the end of the collection, i.e. appending elements. But then appending
elements is the single operation that ArrayList is most
optimized for: if you just want a statically sized collection, a Java
array (e.g. Object[]) gives better performance than any
collection object. Beyond the append operation, measured timings are
very mixed and reflect various JVM optimizations more than anything
else.
For example, for insertions to the start of the collection (first
two rows of table 1), the HotSpot 2.0 JVM with
LinkedLists giving the fastest time (85.3%), and the
second fastest time is the 1.2 JVM with ArrayList
(100%). These two results illustrate that the simple JIT compiler in
1.2 is very efficient at doing simple things like iterating over
and copying arrays. The more sophisticated JVM with optimizing
compiler in HotSpot is able to improve more complex activities such as
object creation (of LinkedList nodes) and take advantage
of code-inlining. The 1.3 JVM results seem to indicate a severe
underperformance in simple operations which is likely to be improved
in later JVM versions.
What I have only partially measured here are other advantages that
ArrayList has over LinkedList, namely the
ability to pre-size collections, and the reduced garbage collection
overhead of ArrayLists. Specifically,
ArrayLists can be created with a particular size (e.g. in
this test the ArrayList could be created with a capacity
of 100 elements), thus avoiding all the growth overheads. The figures
in brackets in table 1 show the improvements possible with pre-sizing
collections. LinkedLists (up to SDK 1.3) cannot be
pre-sized.