What are the segment and offset in real mode memory addressing?

narayanpatra picture narayanpatra · Nov 7, 2010 · Viewed 13.4k times · Source

I am reading about memory addressing. I read about segment offset and then about descriptor offset. I know how to calculate the exact addresses in real mode. All this is OK, but I am unable to understand what exactly offset is? Everywhere I read:

In real mode, the registers are only 16 bits, so you can only address up to 64k. In order to allow addressing of more memory, addresses are calculated from segment * 16 + offset.

Here I can understand the first line. We have 16 bits, so we can address up to 2^16 = 64k.

But what is this second line? What the segment represent? Why we multiply it with 16? why we add offset. I just can't understand what this offset is? Can anybody explain me or give me link for this please?

Answer

cHao picture cHao · Nov 7, 2010

When Intel was building the 8086, there was a valid case for having more than 64KB in a machine, but there was no way it'd ever use a 32-bit address space. Back then, even a megabyte was a whole lot of memory. (Remember the infamous quote "640K ought to be enough for anybody"? It's essentially a mistranslation of the fact that back then, 1MB was freaking huge.) The word "gigabyte" wouldn't be in common use for another 15-20 years, and it wouldn't be referring to RAM for like another 5-10 years after that.

So instead of implementing an address space so huge that it'd "never" be fully utilized, what they did was implement 20-bit addresses. They still used 16-bit words for addresses, because after all, this is a 16-bit processor. The upper word was the "segment" and the lower word was the "offset". The two parts overlapped considerably, though -- a "segment" is a 64KB chunk of memory that starts at (segment) * 16, and the "offset" can point anywhere within that chunk. In order to calculate the actual address, you multiply the segment part of the address by 16 (or shift it left by 4 bits...same thing), and then add the offset. When you're done, you have a 20-bit address.

 19           4  0
  +--+--+--+--+
  |  segment  |
  +--+--+--+--+--+
     |   offset  |
     +--+--+--+--+

For example, if the segment were 0x8000, and the offset were 0x0100, the actual address comes out to ((0x8000 << 4) + 0x0100) == 0x80100.

   8  0  0  0
      0  1  0  0
  ---------------
   8  0  1  0  0

The math is rarely that neat, though -- 0x80100 can be represented by literally thousands of different segment:offset combinations (4096, if my math is right).