Assume $t2 = 0x55555550. Then, after executing the following instruction:
andi $t2, $t2, -1
$t2 becomes 0x00005550.
This is confirmed by the MIPS emulator [1].
However, this is not what I expected. I think the answer should be 0x55555550 & 0xFFFFFFFF = 0x55555550. I think the constant -1 was sign extended to 0xFFFFFFFF before the AND operation. But it appears that the answer was 0x55555550 & 0x0000FFFF.
Why is -1 extended to 0x0000FFFF instead of 0xFFFFFFFF?
Footnote 1: Editor's note: MARS with "extended pseudo-instructions" enabled does expand this to multiple instructions to generate 0xffffffff in a tmp register, thus leaving $t2 unchanged. Otherwise MARS and SPIM both reject it with an error as not encodable. Other assemblers may differ.
Your expectation is correct, but your interpretation of your experimental results is not.
"$t2 becomes 0x00005550. This is confirmed by the MIPS emulator."
No, this is incorrect. So, one of the following: you do not have 0x55555550 in $t2 before the andi as you assume, but 0x5550 instead (i.e. your test program doesn't set up $t2 correctly).
"However, it is not what I expected. I think the answer should be 0x55555550 & 0xFFFFFFFF = 0x55555550. I think the constant -1 was sign extended to 0xFFFFFFFF before the and logic."
Yes, this is correct. And, I'll explain what is happening and why below.
"But it appears that the answer was 0x55555550 & 0x0000FFFF. Why -1 is sign extended to 0x0000FFFF instead of 0xFFFFFFFF?"
It wasn't. It was sign extended to 0xFFFFFFFF. Again, you're reading the experimental results incorrectly [or your test program has a bug].
MIPS simulators and assemblers have pseudo-ops.
These are instructions that may or may not exist as real, physical instructions. However, they are interpreted by the assembler to generate a sequence of physical/real instructions.
An example of a "pure" pseudo-op is li ("load immediate"). It has no corresponding instruction, but usually generates a two-instruction sequence: lui, ori (which are physical instructions).
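The lui/ori split can be modelled outside the assembler. The following is a minimal Python sketch, not MIPS code; the helper names split_for_lui_ori, lui and ori are made up for illustration and simply mimic what the two real instructions do to a 32-bit register.

```python
def split_for_lui_ori(value):
    """Split a 32-bit constant into the two 16-bit halves li needs."""
    value &= 0xFFFFFFFF
    upper = value >> 16       # goes into lui
    lower = value & 0xFFFF    # goes into ori
    return upper, lower


def lui(imm16):
    """Model of lui: place the immediate in the upper 16 bits, clear the rest."""
    return (imm16 & 0xFFFF) << 16


def ori(reg, imm16):
    """Model of ori: OR the register with the zero-extended immediate."""
    return (reg | (imm16 & 0xFFFF)) & 0xFFFFFFFF


upper, lower = split_for_lui_ori(0x55555550)
at = lui(upper)       # lui $at,0x5555
t2 = ori(at, lower)   # ori $t2,$at,0x5550
print(hex(upper), hex(lower), hex(t2))
```

With the constant from the question, the halves are 0x5555 and 0x5550, which are the immediates that show up in the lui and ori lines of the MARS listings.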
Pseudo-ops should not be confused with assembler directives, such as .text, .data, .word, .eqv, etc.
Some pseudo-ops can overlap with actual physical instructions. That is what is happening with your example.
In fact, the assembler examines any given instruction as a potential pseudo-op. It may determine that it can fulfill the intent with a single physical instruction. If not, it will generate a 1-3 instruction sequence and may use the [reserved] $at register [which is $1] as part of that sequence.
In MARS, to see the actual real instructions, look in the Basic column of the source window.
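The Code column next to Basic holds the encoded instruction word, and it can be decoded by hand to check which real instruction was emitted. Here is a small Python sketch of that decoding for I-type instructions; the function name decode_i_type and the OPCODES table are illustrative assumptions that cover only the four opcodes used in this answer.

```python
# Standard MIPS32 I-type opcodes for the instructions used in this answer.
OPCODES = {0x08: "addi", 0x0C: "andi", 0x0D: "ori", 0x0F: "lui"}


def decode_i_type(word):
    """Split a 32-bit I-type instruction word into its fields."""
    opcode = (word >> 26) & 0x3F   # bits 31..26
    rs = (word >> 21) & 0x1F       # bits 25..21
    rt = (word >> 16) & 0x1F       # bits 20..16
    imm = word & 0xFFFF            # bits 15..0, raw 16-bit field
    return OPCODES.get(opcode, "?"), rs, rt, imm


# Code words taken from the MARS listings in this answer.
for word in (0x3C015555, 0x342A5550, 0x214BFFFF, 0x314BFFFF):
    name, rs, rt, imm = decode_i_type(word)
    print(f"0x{word:08x}: {name} rs=${rs} rt=${rt} imm=0x{imm:04x}")
```

Note that the immediate field is only 16 bits wide in every case. Whether it is sign extended or zero extended is decided by the instruction, not by the encoding.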
For the sake of the completeness of my answer, all that follows is prefaced by the top comments.
I've created three example programs:
1. addi, as in your original post
2. andi, as in your corrected post
3. andi that uses an unsigned argument

(1) Here is the assembler source for your original question using addi:
.text
.globl main
main:
li $t2,0x55555550
addi $t3,$t2,-1
nop
Here is how MARS interpreted it:
Address     Code        Basic                     Source
0x00400000  0x3c015555  lui $1,0x00005555         4  li $t2,0x55555550
0x00400004  0x342a5550  ori $10,$1,0x00005550
0x00400008  0x214bffff  addi $11,$10,0xffffffff   5  addi $t3,$t2,-1
0x0040000c  0x00000000  nop                       6  nop
addi will sign extend its 16-bit immediate, so we have 0xFFFFFFFF. Then, doing a two's complement add operation, we have a final result of 0x5555554F.
Thus, the assembler didn't need to generate extra instructions for the addi, so the addi pseudo-op generated a single real addi.
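The arithmetic for this case can be checked with a short Python model. This is only a sketch: sign_extend16 and addi are hypothetical helper names, and the model ignores the overflow trap that a real addi can raise.

```python
def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed two's complement value."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def addi(rs, imm):
    """Model of addi: add the sign-extended immediate, wrap to 32 bits.

    Assumption: the overflow exception of the real instruction is not modelled.
    """
    return (rs + sign_extend16(imm)) & 0xFFFFFFFF


# The 16-bit field 0xFFFF viewed as a 32-bit pattern after sign extension.
print(hex(sign_extend16(0xFFFF) & 0xFFFFFFFF))
# addi $t3,$t2,-1 with $t2 = 0x55555550
print(hex(addi(0x55555550, -1)))
```

The first line prints 0xffffffff and the second prints 0x5555554f, matching the values described above.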
(2) Here is the andi source:
.text
.globl main
main:
li $t2,0x55555550
andi $t3,$t2,-1
nop
Here is the assembly:
Address     Code        Basic                     Source
0x00400000  0x3c015555  lui $1,0x00005555         4  li $t2,0x55555550
0x00400004  0x342a5550  ori $10,$1,0x00005550
0x00400008  0x3c01ffff  lui $1,0xffffffff         5  andi $t3,$t2,-1
0x0040000c  0x3421ffff  ori $1,$1,0x0000ffff
0x00400010  0x01415824  and $11,$10,$1
0x00400014  0x00000000  nop                       6  nop
Whoa! What happened? The andi generated three instructions.
A real andi instruction does not sign extend its immediate argument; it zero extends it. So, the largest unsigned value we can use in a real andi is 0xFFFF.
But, by specifying -1, we told the assembler that we did want sign extension (i.e. 0xFFFFFFFF).
So, the assembler could not fulfill the intent with a single instruction and we get the sequence above. And the generated sequence could not use andi but had to use the register form: and. Here is the andi generated code converted back into more friendly asm source:
lui $at,0xFFFF
ori $at,$at,0xFFFF
and $t3,$t2,$at
As to the result, we're anding 0x55555550 and 0xFFFFFFFF, which gives a [still unchanged] value of 0x55555550.
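To see the difference between the two masks, here is a minimal Python sketch that contrasts a real andi (zero-extended 16-bit immediate) with the register-form and used in the expanded sequence. The names andi_real and and_reg are illustrative only.

```python
def andi_real(rs, imm16):
    """Model of the hardware andi: the 16-bit immediate is zero extended."""
    return rs & (imm16 & 0xFFFF)


def and_reg(rs, rt):
    """Model of the register-form and: both operands are full 32-bit values."""
    return rs & rt & 0xFFFFFFFF


t2 = 0x55555550

# What a real andi would give if the field 0xFFFF were encoded directly.
print(hex(andi_real(t2, 0xFFFF)))

# What the expanded sequence gives: $at holds 0xFFFFFFFF, then and $t3,$t2,$at.
at = (0xFFFF << 16) | 0xFFFF
print(hex(and_reg(t2, at)))
```

The first result is 0x5550 and the second is 0x55555550. Only the second matches the intent of a -1 mask, which is why the assembler cannot use a single real andi here.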
(3) Here is the source for an unsigned version of andi:
.text
.globl main
main:
li $t2,0x55555550
andi $t3,$t2,0xFFFF
nop
Here is the assembler output:
Address     Code        Basic                     Source
0x00400000  0x3c015555  lui $1,0x00005555         4  li $t2,0x55555550
0x00400004  0x342a5550  ori $10,$1,0x00005550
0x00400008  0x314bffff  andi $11,$10,0x0000ffff   5  andi $t3,$t2,0xFFFF
0x0040000c  0x00000000  nop                       6  nop
When the assembler sees that we're using a hex constant (i.e. the 0x prefix), it tries to fulfill the value as an unsigned operation. So, it doesn't need to sign extend. And, the real andi can fulfill the request.
The result of this andi is 0x5550 (i.e. 0x55555550 & 0x0000FFFF).
Note that if we had used a mask value of 0x1FFFF, that would be unsigned. But, it's larger than 16 bits, so the assembler would generate a multi-instruction sequence to fulfill the request.
And, the result here would be 0x15550.
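The three cases (-1, 0xFFFF, 0x1FFFF) can be summarised in a rough Python sketch of the decision the assembler appears to make. This is a hypothetical model that only mirrors the listings in this answer; it is not how MARS is actually implemented, and expand_andi is an invented name.

```python
def expand_andi(rt, rs, imm):
    """Return the real instructions a pseudo-op andi might expand to.

    Assumption: a value that fits in an unsigned 16-bit field becomes a
    single real andi; anything else is built in $at with lui/ori and then
    combined with the register-form and.
    """
    if 0 <= imm <= 0xFFFF:
        return [f"andi {rt},{rs},0x{imm:04x}"]
    mask = imm & 0xFFFFFFFF  # -1 becomes 0xFFFFFFFF
    return [
        f"lui $at,0x{(mask >> 16):04x}",
        f"ori $at,$at,0x{(mask & 0xFFFF):04x}",
        f"and {rt},{rs},$at",
    ]


t2 = 0x55555550
for imm in (-1, 0xFFFF, 0x1FFFF):
    result = t2 & (imm & 0xFFFFFFFF)
    print(imm, expand_andi("$t3", "$t2", imm), hex(result))
```

For -1 and 0x1FFFF the sketch yields three instructions, and for 0xFFFF a single andi; the computed results are 0x55555550, 0x5550 and 0x15550 respectively.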