I am having trouble with the creation and addressing of an array created purely in assembly using the instruction set for the Atmel ATMega8535.
What I understand so far is as follows:
What I am looking to do specifically is create a 1-D array of 8-bit integers with predefined values populating it during initialization it does not have to be written to, only addressed when needed. The problem ultimately lying in not being able to translate the logic into the assembly code.
I have tried with little progress to do so using support from the following books:
Any help, advice or further resources would be greatly appreciated.
If your array is read-only, you do not need to copy it to RAM. You can keep it in Flash and read it from there when needed. This will save you precious RAM, at the cost of slower access (read from RAM is 2 cycles, read from flash is 3 cycles).
You can declare your array like this:
.global my_array
.type my_array, @object
my_array:
.byte 12, 34, 56, 78
Then, to read a member of the array, you have to compute:
adress of member = array base address + member index
If your members were more than one byte, you would have to also multiply
the index by the size, but this is not the case here. Then, you put the
address of the required member in the Z register and issue an lpm
instruction. Here is a function implementing this logic:
.global read_data
; input: r24 = array index, r1 = 0
; output: r24 = array value
; clobbers: r30, r31
read_data:
ldi r30, lo8(my_array) ; load Z = address of my_array
ldi r31, hi8(my_array) ; ...high byte also
add r30, r24 ; add the array index
adc r31, r1 ; ...and add 0 to propagate the carry
lpm r24, Z
ret
@scottt advised you to first write in C, then look at the generated assembly. I consider this very good advice, let's follow it:
#include <stdint.h>
__flash const uint8_t my_array[] = {12, 34, 56, 78};
uint8_t read_data(uint8_t index)
{
return my_array[index];
}
The __flash
keyword identifying a “named address space” is an embedded
C extension supported by
gcc. The
generated assembly is slightly different from the previous one: instead
of computing base_address + index
, gcc does index − (−base_address)
:
read_data:
mov r30, r24 ; load Z = array index
ldi r31, 0 ; ...high byte of index is 0
subi r30, lo8(-(my_array)) ; subtract -(address of my array)
sbci r31, hi8(-(my_array)) ; ...high byte also
lpm r24, Z
ret
This is just as efficient as the previous hand-rolled assembly, except that it does not need the r1 register to be initialized to zero. But keeping r1 to zero is part of the gcc ABI anyway, so it should make no difference.
This section is meant to answer the question in the comment: how can we access the array if we do not know its address? The answer is: we access it by its name, just like in the code snippets above. Choosing the final address for the array, as well as replacing the name by the appropriate address, is the linker’s job.
Assembling (with avr-gcc -c
) and disassembling (with avr-objdump -d
)
the first code snippet gives this:
my_array.o, section .text:
00000000 <my_array>:
0: 0c 22 38 4e ."8N
If we were compiling from C, gcc would have put the array in the .progmem.data section instead of .text, but it makes little difference. The numbers “0c 22 38 4e” are the array contents, in hex. The characters to the right are the ASCII equivalents, ‘.’ being the placeholder for non printing characters.
The object file also carries this symbol table, shown by avr-nm
:
my_array.o:
00000000 T my_array
meaning the symbol “my_array” has been defined as referring to offset 0 into the .text section (implied by “T”) of this object.
Assembling and disassembling the second code snippet gives this:
read_data.o, section .text:
00000000 <read_data>:
0: e0 e0 ldi r30, 0x00
2: f0 e0 ldi r31, 0x00
4: e8 0f add r30, r24
6: f1 1d adc r31, r1
8: 84 91 lpm r24, Z
a: 08 95 ret
Comparing the disassembly with the actual source code, it can be seen
that the assembler replaced the address of my_array with 0x00, which is
almost guaranteed to be wrong. But it also left a note to the linker in
the form of “relocation records”, shown by avr-objdump -r
:
read_data.o, RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
00000000 R_AVR_LO8_LDI my_array
00000002 R_AVR_HI8_LDI my_array
This tells the linker that the ldi
instructions at offsets 0x00 and
0x02 are intended to load the low byte and the high byte (respectively)
of the final address of my_array. The object file also carries this
symbol table:
read_data.o:
U my_array
00000000 T read_data
where the “U” line means the file makes use of an undefined symbol named “my_array”.
Linking these pieces together, with a suitable main(), yields a binary containing the C runtime from avr-lbc, together with our code:
0000003c <my_array>:
3c: 0c 22 38 4e ."8N
00000040 <read_data>:
40: ec e3 ldi r30, 0x3C
42: f0 e0 ldi r31, 0x00
44: e8 0f add r30, r24
46: f1 1d adc r31, r1
48: 84 91 lpm r24, Z
4a: 08 95 ret
It should be noted that, not only has the linker moved the pieces around
to their final addresses, it has also fixed the arguments of the ldi
instructions so that they now point to the correct address of my_array.