Difference Between a Direct-Mapped Cache and Fully Associative Cache

madcrazydrumma picture madcrazydrumma · May 7, 2015 · Viewed 45.4k times · Source

I can't quite understand the main differences between the two caches and I was wondering if someone could help me out?

I know that with a fully associative cache an address can be stored on any line in the tag array and a direct-mapped cache can have only one address on one line.

But that's about all I know.

Answer

Maxim picture Maxim · Oct 14, 2015

In short you have basically answered your question. These are two different ways of organizing a cache (another one would be n-way set associative, which combines both, and most often used in real world CPU).

Direct-Mapped Cache is simplier (requires just one comparator and one multiplexer), as a result is cheaper and works faster. Given any address, it is easy to identify the single entry in cache, where it can be. A major drawback when using DM cache is called a conflict miss, when two different addresses correspond to one entry in the cache. Even if the cache is big and contains many stale entries, it can't simply evict those, because the position within cache is predetermined by the address.

Full Associative Cache is much more complex, and it allows to store an address into any entry. There is a price for that. In order to check if a particular address is in the cache, it has to compare all current entries (the tags to be exact). Besides in order to maintain temporal locality, it must have an eviction policy. Usually approximation of LRU (least recently used) is implemented, but it is also adds additional comparators and transistors into the scheme and of course consumes some time.

Fully associative caches are practical for small caches (for instance, the TLB caches on some Intel processors are fully associative) but those caches are small, really small. We are talking about a few dozen entries at most.

Even L1i and L1d caches are bigger and require a combined approach: a cache is divided into sets, and each set consists of "ways". Sets are directly mapped, and within itself are fully associative. The number of "ways" is usually small, for example in Intel Nehalem CPU there are 4-way (L1i), 8-way (L1d, L2) and 16-way (L3) sets. N-way set associative cache pretty much solves the problem of temporal locality and not that complex to be used in practice.

I would highly recommend a 2011 course by UC Berkeley, "Computer Science 61C", available on Archive. In addition to other stuff it contains 3 lectures about memory hierarchy and cache implementations.

There is also a 2015 edition of this course freely available on youtube.