Why is a LinkedList Generally Slower than a List?

JP Richardson · May 12, 2011 · Viewed 15.3k times

I started using some LinkedLists instead of Lists in some of my C# algorithms hoping to speed them up. However, I noticed that they just felt slower. Like any good developer, I figured that I should do due diligence and verify my feelings. So I decided to benchmark some simple loops.

I thought that populating the collections with some random integers should be sufficient. I ran this code in Debug mode to avoid any compiler optimizations. Here is the code that I used:

var rand = new Random(Environment.TickCount);
var ll = new LinkedList<int>();
var list = new List<int>();
int count = 20000000;

BenchmarkTimer.Start("Linked List Insert");
for (int x = 0; x < count; ++x)
  ll.AddFirst(rand.Next(int.MaxValue));
BenchmarkTimer.StopAndOutput();

BenchmarkTimer.Start("List Insert");
for (int x = 0; x < count; ++x)
  list.Add(rand.Next(int.MaxValue));
BenchmarkTimer.StopAndOutput();

int y = 0;
BenchmarkTimer.Start("Linked List Iterate");
foreach (var i in ll)
  ++y; //some atomic operation;
BenchmarkTimer.StopAndOutput();

int z = 0;
BenchmarkTimer.Start("List Iterate");
foreach (var i in list)
  ++z; //some atomic operation;
BenchmarkTimer.StopAndOutput();

Here is the output:

Linked List Insert: 8959.808 ms
List Insert: 845.856 ms
Linked List Iterate: 203.632 ms
List Iterate: 125.312 ms

This result baffled me. A linked-list insert should be O(1), whereas a List insert is amortized O(1) with an O(n) worst case (because of the copy) when the backing array needs to be resized. Iterating either collection should be O(n) through its enumerator. I looked at the disassembled output and it doesn't shed much light on the situation.

Anyone else have any thoughts on why this is? Did I miss something glaringly obvious?

Note: here is the source for the simple BenchmarkTimer class: http://procbits.com/2010/08/25/benchmarking-c-apps-algorithms/

Answer

Dan Tao · May 12, 2011

Update (in response to your comment): you're right, discussing big-O notation by itself is not exactly useful. I included a link to James's answer in my original response because he already offered a good explanation of the technical reasons why List<T> outperforms LinkedList<T> in general.

Basically, it's a matter of memory allocation and locality. When all of your collection's elements are stored in an array internally (as is the case with List<T>), it's all in one contiguous block of memory which can be accessed very quickly. This applies both to adding (as this simply writes to a location within the already-allocated array) as well as iterating (as this accesses many memory locations that are very close together rather than having to follow pointers to completely disconnected memory locations).
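The locality effect is easy to see for yourself. Below is a minimal sketch (not from the original post) that times iteration over both collections with System.Diagnostics.Stopwatch rather than the asker's custom BenchmarkTimer; the element count is an arbitrary illustrative choice:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class LocalityDemo
{
    static void Main()
    {
        const int count = 1_000_000;
        var list = new List<int>();
        var linked = new LinkedList<int>();
        for (int i = 0; i < count; i++)
        {
            list.Add(i);        // writes into one contiguous array
            linked.AddLast(i);  // allocates a separate node object each time
        }

        long listSum = 0, linkedSum = 0;

        var sw = Stopwatch.StartNew();
        foreach (var v in list) listSum += v;     // sequential, cache-friendly reads
        sw.Stop();
        Console.WriteLine($"List iterate:       {sw.ElapsedMilliseconds} ms");

        sw.Restart();
        foreach (var v in linked) linkedSum += v; // follows a pointer per element
        sw.Stop();
        Console.WriteLine($"LinkedList iterate: {sw.ElapsedMilliseconds} ms");

        Console.WriteLine(listSum == linkedSum);  // True: same work, different memory layout
    }
}
```

Both loops do identical arithmetic; any timing gap comes purely from the memory layout being traversed.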

A LinkedList<T> is a specialized collection; it only outshines List<T> when you are performing random insertions or removals from the middle of the list, and even then only if you already hold a reference to a node near the insertion point.
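To illustrate the caveat (this example is mine, not the answerer's): List<T>.Insert must shift every element to the right of the insertion point, while LinkedList<T>.AddAfter is a constant-time splice, but only once you have the node, and Find is itself an O(n) walk:

```csharp
using System;
using System.Collections.Generic;

class MiddleInsertDemo
{
    static void Main()
    {
        // List<T>: inserting mid-list shifts every later element right -> O(n) per insert.
        var list = new List<int> { 1, 2, 4, 5 };
        list.Insert(2, 3);                         // copies elements 4 and 5 up one slot
        Console.WriteLine(string.Join(",", list)); // prints 1,2,3,4,5

        // LinkedList<T>: splicing next to a node you already hold is O(1),
        // but locating that node in the first place is an O(n) search.
        var linked = new LinkedList<int>(new[] { 1, 2, 4, 5 });
        LinkedListNode<int> node = linked.Find(2);   // O(n) walk
        linked.AddAfter(node, 3);                    // O(1) splice
        Console.WriteLine(string.Join(",", linked)); // prints 1,2,3,4,5
    }
}
```

So the linked list only wins when your algorithm keeps node references around between operations, rather than searching for a position each time.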

As for the question of scaling: you're right, if big-O notation is all about how well an operation scales, then an O(1) operation should eventually beat out an operation of higher complexity given a large enough input, which is obviously what you were going for with 20 million iterations.

This is why I mentioned that List<T>.Add has an amortized complexity of O(1). That means performing n adds on a list scales linearly with n, the same (effectively) as with a linked list, even though the list occasionally has to resize itself (this is where the "amortized" comes in; I encourage you to read the Wikipedia article on amortized analysis if you haven't already). They scale the same.

Now, interestingly, and perhaps counter-intuitively, this means that if anything, the performance difference between List<T> and LinkedList<T> (again, when it comes to adding) actually becomes more obvious as the number of elements increases. The reason is that when the list runs out of space in its internal array, it doubles the size of the array; and thus with more and more elements, the frequency of resizing operations decreases—to the point where the array is basically never resizing.

So let's say a List<T> starts with an internal array large enough to hold 4 elements (I believe that's accurate, though I don't remember for sure). Then as you add up to 20 million elements, it resizes itself a total of ~(log2(20000000) - 1), or 23, times. Compare this to the 20 million calls to the considerably less efficient AddFirst on a LinkedList<T>, each of which allocates a new LinkedListNode<T>, and those 23 resizes suddenly seem pretty insignificant.
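You can count the resizes empirically rather than taking the arithmetic on faith. The sketch below (my addition, not part of the original answer) watches List<T>.Capacity for changes, which signals that the backing array was reallocated; the exact count depends on the runtime's growth policy, so treat the "about two dozen" figure as an assumption for the doubling policy described above:

```csharp
using System;
using System.Collections.Generic;

class ResizeCountDemo
{
    static void Main()
    {
        const int count = 20_000_000;
        var list = new List<int>();
        int resizes = 0, lastCapacity = 0;

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity) // capacity changed -> array reallocated
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }

        // With a first allocation to capacity 4 and doubling thereafter,
        // this reports roughly two dozen capacity changes for 20 million adds --
        // vanishingly rare next to 20 million per-call node allocations.
        Console.WriteLine($"Resizes for {count:N0} adds: {resizes}");
    }
}
```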

I hope this helps! If I haven't been clear on any points, let me know and I will do my best to clarify and/or correct myself.


James is right on.

Remember that big-O notation is meant to give you an idea of how the performance of an algorithm scales. It does not mean that something that performs in guaranteed O(1) time will outperform something else that performs in amortized O(1) time (as is the case with List<T>).

Suppose you have a choice of two jobs, one of which requires a commute 5 miles down a road that occasionally suffers from traffic jams. Ordinarily this drive should take you about 10 minutes, but on a bad day it could be more like 30 minutes. The other job is 60 miles away but the highway is always clear and never has any traffic jams. This drive always takes you an hour.

That's basically the situation with List<T> and LinkedList<T> for purposes of adding to the end of the list.