In 8086 assembly programming, we can only load a value into a segment register by first loading it into a general-purpose register and then moving it from that register to the segment register.
Why can't we load it directly? Is there a special reason this isn't allowed?
What is the difference between mov ax,5000H and mov ax,[5000H]? Does [5000h] mean the contents of memory location 5000h?
Remember that the syntax of assembly language (any assembly) is just a human-readable way to write machine code. The rules of what you can do in machine code depend on how the processor's electronics were designed, not on what the assembler syntax could easily support.
So, just because it looks like you could write mov ds, 5000h, and conceptually there seems to be no reason why you shouldn't be able to, the real question is: "is there a mechanism by which the processor can load a segment register from an immediate value in the instruction?"
In the case of the 8086, I figure the reason is simply that the engineers didn't build an electrical path (or assign an opcode) that feeds an immediate operand from the instruction stream into the segment registers. The paths that do exist load a segment register from a general-purpose register or from a memory operand, so mov ds, ax and mov ds, [5000h] are both valid, as are pop ds, lds and les; only the immediate form is missing.
Why? I have several theories, but no authoritative knowledge.
The most likely reason is simply to keep the design simple: it takes extra wiring and gates to do that, and it's an uncommon enough operation (this is the 1970s) that it wasn't worth the real estate on the chip. This is not surprising; the 8086 already went overboard by allowing any of the general-purpose registers to be connected to the ALU (arithmetic logic unit), which lets any of them be used as an accumulator. I'm sure that wasn't cheap to do. Most processors at the time only allowed one register (the accumulator) to be used for that purpose.
It's also possible that writing a segment register produced edge cases that were hard to get right in the circuitry, which would be a reason to keep the number of ways to do it small. After all, the segment register being written might be the one currently in use for addressing.
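The two-step load described in the question can be sketched as follows. This is a minimal fragment in NASM syntax for 16-bit code; 5000h is an arbitrary example value, not something a real program would necessarily use as a segment.

```asm
bits 16

        ; mov ds, 5000h      ; rejected by the assembler: the 8086 has no
        ;                    ; immediate-to-segment-register form of MOV

        mov ax, 5000h        ; step 1: immediate into a general-purpose register
        mov ds, ax           ; step 2: general-purpose register into DS
```

Any 16-bit general-purpose register works as the intermediate; AX is just the conventional choice.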
As for the brackets, you are correct. Let's say memory position 5000h contains the number 4321h. mov ax, 5000h puts the value 5000h into ax, while mov ax, [5000h] loads 4321h from memory into ax. Essentially, the brackets act like the * pointer dereference operator in C.
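To make the difference concrete, here is a small sketch in NASM syntax. It assumes 16-bit code and that DS already points at a data segment in which offset 5000h is safe to write; the first instruction only exists to set up the example value.

```asm
bits 16

        mov word [5000h], 4321h   ; set-up: store the word 4321h at DS:5000h

        mov ax, 5000h             ; AX = 5000h  (the number itself)
        mov ax, [5000h]           ; AX = 4321h  (the word stored at DS:5000h)
```

Without brackets the operand is the value; with brackets it is an address, taken relative to DS by default.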
Just to highlight the fact that assembly is an idealized abstraction of what machine code can do, note that the two variations are not the same instruction with different parameters, but completely different opcodes. They could have used, say, MOV for the first and MVD (MoVe Direct addressed memory) for the second opcode, but they must have decided that the bracket syntax was easier for programmers to remember.
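For illustration, these are the bytes I would expect an assembler such as NASM to emit for the two forms in 16-bit mode, based on Intel's documented encodings (B8 is "move immediate word into AX", A1 is "move word at a direct offset into AX"). An assembler may in principle pick a longer equivalent encoding, so treat the listing as a sketch.

```asm
bits 16

        mov ax, 5000h        ; B8 00 50   opcode B8, then the immediate 5000h (low byte first)
        mov ax, [5000h]      ; A1 00 50   opcode A1, then the offset 5000h (low byte first)
```

The operand bytes are identical; only the opcode byte tells the processor whether 5000h is the data or the address of the data.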