Say we have an unsorted array and linked list. The worst case when searching for an element for both data structures would be O( n ), but my question is:
Would the array still be way faster because of the use of spatial locality within the cache, or will the cache make use of branch locality allowing linked lists to be just as fast as any array ?
My understanding for an array is that if an element is accessed, that block of memory and many of the surrounding blocks are then brought into the cache allowing for much faster memory accesses.
My understanding for a linked list is that since the path that will be taken to traverse the list is predictable, then the cache will exploit that and still store the appropriate blocks of memory even though the nodes of the list can be far apart within the heap.
Your understanding of the the array case is mostly correct. If an array is accessed sequentially, many processors will not only fetch the block containing the element, but will also prefetch subsequent blocks to minimize cycles spent waiting on cache misses. If you are using an Intel x86 processor, you can find details about this in the Intel x86 optimization manual. Also, if the array elements are small enough, loading a block containing an element means the next element is likely in the same block.
Unfortunately, for linked lists the pattern of loads is unpredictable from the processor's point of view. It doesn't know that when loading an element at address X that the next address is the contents of (X + 8).
As a concrete example, the sequence of load addresses for a sequential array access is nice and predictable. For example, 1000, 1016, 1032, 1064, etc.
For a linked list it will look like: 1000, 3048, 5040, 7888, etc. Very hard to predict the next address.