Tuesday, May 06, 2008

java: Arrays vs ArrayLists vs Vectors vs LinkedLists

The question is which one to use and when? Which one is more efficient? Lets look into each one of them.

An array is basically a fixed size collection of elements. The bad point about an array is that it is not resizable. But its constant size provides efficiency. So arrays are better to use when you know the number of elements available with you.

ArrayList is another collection where the number of elements is resizable. So if you are not sure about the number of elements in the collection use an ArrayList. But there are certain facts to be considered while using ArrayLists.

=> ArrayLists is not synchronized. So if there are multiple threads accessing and modifying the list, then synchronization might be required to be handled externally.
=> ArrayList is internally implemented as an array. So whenever a new element is added an array of n+1 elements is created and then all the n elements are copied from the old array to the new array and then the new element is inserted in the new array.
=> Adding n elements requires O(n) time.
=> The isEmpty, size, iterator, set, get and listIterator operations require the same amount of time, independently of element you access.
=> Only Objects can be added to an ArrayList
=> Permits null elements

If you need to add a large number of elements to an ArrayList, you can use the ensureCapacity(int minCapacity) operation to ensure that the ArrayList has that required capacity. This will ensure that the Array is copied only once when all the elements are added and increase the performance of addition of elements to an ArrayList. Also inserting an element in the middle of say 1000 elements would require you to move 500 elements up or down and then add the element in the middle.

The benefit of using ArrayList is that accessing random elements is cheap and is not affected by the number of elemets in the ArrayList. But addition of elements to the head of tail or in the middle is costly.

Vector is similar to ArrayList with the difference that it is synchronized. It offers some other benefits like it has an initial capacity and an incremental capacity. So if your vector has a capacity of 10 and incremental capacity of 10, then when you are adding the 11th element a new Vector would be created with 20 elements and the 11 elements would be copied to the new Vector. So addition of 12th to 20th elements would not require creation of new vector.

By default, when a vector needs to grow the size of its internal data structure to hold more elements, the size of internal data structure is doubled, whereas for ArrayList the size is increased by only 50%. So ArrayList is more conservative in terms of space.

LinkedList is much more flexible and lets you insert, add and remove elements from both sides of your collection - it can be used as queue and even double-ended queue! Internally a LinkedList does not use arrays. LinkedList is a sequence of nodes, which are double linked. Each node contains header, where actually objects are stored, and two links or pointers to next or previous node. A LinkedList looks like a chain, consisting of people who hold each other's hand. You can insert people or node into that chain or remove. Linked lists permit node insert/remove operation at any point in the list in constant time.

So inserting elements in linked list (whether at head or at tail or in the middle) is not expensive. Also when you retrieve elements from the head it is cheap. But when you want to randomly access the elements of the linked list or access the elements at the tail of the list then the operations are heavy. Cause, for accessing the n+1 th element, you will need to parse through the first n elements to reach the n+1th element.

Also linked list is not synchronized. So multiple threads modifying and reading the list would need to be synchronized externally.

So the choice of which class to use for creating lists depends on the requirements. ArrayList or Vector( if you need synchronization ) could be used when you need to add elements at the end of the list and access elements randomly - more access operations than add operations. Whereas a LinkedList should be used when you need to do a lot of add/delete (elements) operations from the head or the middle of the list and your access operations are comparatively less.

4 comments:

Anonymous said...

This had just the information I was looking for. Thanks. ^_^

Gaurav said...

nice article..very useful..

Jen said...

Thanks! It's people like you that are passionate about their subject, that help people like me who are too lazy to read through pages and pages of text, when we just need one simple question practically answered!

Topman said...

great explanation, could help me secure a job ;-)