Creating and addressing arrays in AVR Assembly (using the ATmega8535)

Daniel Dunn · Mar 8, 2015 · Viewed 12.3k times

I am having trouble creating and addressing an array written purely in assembly, using the instruction set of the Atmel ATmega8535.

What I understand so far is as follows:

  • The array contains contiguous data that is equal in length.
  • The creation of the array involves defining the beginning and end locations of the array (much like you would the stack).
  • You would address an index in the array by adding an offset to the base address of the array.

What I am looking to do specifically is create a 1-D array of 8-bit integers, populated with predefined values during initialization. It does not have to be written to, only addressed when needed. The problem ultimately lies in not being able to translate this logic into assembly code.

I have tried to do so, with little progress, using the following books:

  • Some Assembly Required: Assembly Language Programming with the AVR Microcontroller by Timothy S Margush
  • Get Going with...AVR Microcontrollers by Peter Sharpe

Any help, advice or further resources would be greatly appreciated.

Answer

Edgar Bonet · Mar 10, 2015

If your array is read-only, you do not need to copy it to RAM. You can keep it in flash and read it from there when needed. This will save you precious RAM, at the cost of slightly slower access (a read from RAM takes 2 cycles, a read from flash takes 3 cycles).

You can declare your array like this:

.global my_array
.type   my_array, @object
my_array:
    .byte 12, 34, 56, 78

Then, to read a member of the array, you have to compute:

address of member = array base address + member index

If your members were more than one byte, you would have to also multiply the index by the size, but this is not the case here. Then, you put the address of the required member in the Z register and issue an lpm instruction. Here is a function implementing this logic:

.global read_data
; input:    r24 = array index, r1 = 0
; output:   r24 = array value
; clobbers: r30, r31
read_data:
    ldi r30, lo8(my_array)  ; load Z = address of my_array
    ldi r31, hi8(my_array)  ; ...high byte also
    add r30, r24            ; add the array index
    adc r31, r1             ; ...and add 0 to propagate the carry
    lpm r24, Z
    ret
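To see why the adc with r1 is needed, here is a minimal host-side C model of the add/adc pair. It is only a sketch: the function name z_after_add_adc is hypothetical, and the registers are modelled as plain 8-bit variables rather than real AVR state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the add/adc pair in read_data: returns the
   16-bit Z pointer after adding an 8-bit index to a 16-bit base. */
uint16_t z_after_add_adc(uint16_t base, uint8_t index)
{
    uint8_t r30 = (uint8_t)(base & 0xFF);    /* ldi r30, lo8(my_array) */
    uint8_t r31 = (uint8_t)(base >> 8);      /* ldi r31, hi8(my_array) */
    const uint8_t r1 = 0;                    /* zero register (gcc ABI) */

    uint16_t sum = (uint16_t)(r30 + index);  /* add r30, r24 */
    uint8_t carry = (uint8_t)(sum >> 8);     /* 1 if the low byte overflowed */
    r30 = (uint8_t)sum;
    r31 = (uint8_t)(r31 + r1 + carry);       /* adc r31, r1 */

    uint16_t z = (uint16_t)((r31 << 8) | r30);
    /* The byte-wise sequence must agree with a plain 16-bit addition. */
    assert(z == (uint16_t)(base + index));
    return z;
}
```

With a base of 0x00FE and an index of 3, the low byte wraps around to 0x01 and the carry bumps the high byte, so the model returns 0x0101. Without the adc, Z would wrongly point at 0x0001.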

@scottt advised you to first write in C, then look at the generated assembly. I consider this very good advice, so let's follow it:

#include <stdint.h>

__flash const uint8_t my_array[] = {12, 34, 56, 78};

uint8_t read_data(uint8_t index)
{
    return my_array[index];
}

The __flash keyword identifying a “named address space” is an embedded C extension supported by gcc. The generated assembly is slightly different from the previous one: instead of computing base_address + index, gcc does index − (−base_address):

read_data:
    mov r30, r24                ; load Z = array index
    ldi r31, 0                  ; ...high byte of index is 0
    subi r30, lo8(-(my_array))  ; subtract -(address of my array)
    sbci r31, hi8(-(my_array))  ; ...high byte also
    lpm r24, Z
    ret

This is just as efficient as the previous hand-rolled assembly, except that it does not need the r1 register to be initialized to zero. But keeping r1 at zero is part of the gcc ABI anyway, so it should make no difference.
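The subtraction trick can be checked with a small host-side C model of the subi/sbci pair. This is a sketch under assumptions: the function name z_after_subi_sbci is hypothetical, and the constants lo8(-(my_array)) and hi8(-(my_array)) are computed here by negating the base modulo 2^16. A likely reason for gcc's choice is that the AVR instruction set has subi and sbci but no add-immediate counterpart for 8-bit registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of gcc's subi/sbci sequence: computes
   index - (-base) one byte at a time, with borrow. */
uint16_t z_after_subi_sbci(uint16_t base, uint8_t index)
{
    uint16_t neg = (uint16_t)(0u - base);    /* -(my_array), modulo 2^16 */
    uint8_t k_lo = (uint8_t)(neg & 0xFF);    /* lo8(-(my_array)) */
    uint8_t k_hi = (uint8_t)(neg >> 8);      /* hi8(-(my_array)) */

    uint8_t r30 = index;                     /* mov r30, r24 */
    uint8_t r31 = 0;                         /* ldi r31, 0 */

    uint8_t borrow = (uint8_t)(r30 < k_lo);  /* carry flag after subi */
    r30 = (uint8_t)(r30 - k_lo);             /* subi r30, lo8(-(my_array)) */
    r31 = (uint8_t)(r31 - k_hi - borrow);    /* sbci r31, hi8(-(my_array)) */

    uint16_t z = (uint16_t)((r31 << 8) | r30);
    /* Subtracting the negated base is the same as adding the base. */
    assert(z == (uint16_t)(base + index));
    return z;
}
```

For a base of 0x003C the two constants are 0xC4 and 0xFF, and an index of 2 yields 0x003E, the same address the add/adc version computes.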

The role of the linker

This section is meant to answer the question in the comment: how can we access the array if we do not know its address? The answer is: we access it by its name, just like in the code snippets above. Choosing the final address for the array, as well as replacing the name by the appropriate address, is the linker’s job.

Assembling (with avr-gcc -c) and disassembling (with avr-objdump -d) the first code snippet gives this:

my_array.o, section .text:
00000000 <my_array>:
   0:   0c 22 38 4e        ."8N

If we were compiling from C, gcc would have put the array in the .progmem.data section instead of .text, but it makes little difference. The numbers “0c 22 38 4e” are the array contents, in hex. The characters to the right are the ASCII equivalents, ‘.’ being the placeholder for non-printing characters.
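To make the link between the decimal values in the source and the dump explicit, here is a small C sketch that formats bytes in the same hex-plus-ASCII style. The helper format_dump_line is hypothetical and is not how objdump itself is implemented; the exact spacing is an arbitrary choice.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes n bytes as two-digit hex values, then
   their ASCII rendering, with '.' standing in for non-printing bytes. */
void format_dump_line(const uint8_t *data, size_t n, char *out, size_t out_size)
{
    size_t pos = 0;

    assert(out_size >= 4 * n + 2);           /* "xx " per byte, gap, text, NUL */
    memset(out, 0, out_size);

    for (size_t i = 0; i < n; i++)
        pos += (size_t)snprintf(out + pos, out_size - pos, "%02x ", data[i]);
    out[pos++] = ' ';                        /* gap between hex and ASCII */
    for (size_t i = 0; i < n; i++)
        out[pos++] = isprint(data[i]) ? (char)data[i] : '.';
    out[pos] = '\0';
}
```

Feeding it the array {12, 34, 56, 78} gives the hex column 0c 22 38 4e followed by ."8N: 12 is a form feed (non-printing), while 34, 56 and 78 are the characters ", 8 and N.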

The object file also carries this symbol table, shown by avr-nm:

my_array.o:
00000000 T my_array

meaning the symbol “my_array” has been defined as referring to offset 0 into the .text section (implied by “T”) of this object.

Assembling and disassembling the second code snippet gives this:

read_data.o, section .text:
00000000 <read_data>:
   0:   e0 e0        ldi r30, 0x00
   2:   f0 e0        ldi r31, 0x00
   4:   e8 0f        add r30, r24
   6:   f1 1d        adc r31, r1
   8:   84 91        lpm r24, Z
   a:   08 95        ret

Comparing the disassembly with the actual source code, it can be seen that the assembler replaced the address of my_array with 0x00, which is almost guaranteed to be wrong. But it also left a note to the linker in the form of “relocation records”, shown by avr-objdump -r:

read_data.o, RELOCATION RECORDS FOR [.text]:
OFFSET   TYPE              VALUE 
00000000 R_AVR_LO8_LDI     my_array
00000002 R_AVR_HI8_LDI     my_array

This tells the linker that the ldi instructions at offsets 0x00 and 0x02 are intended to load the low byte and the high byte (respectively) of the final address of my_array. The object file also carries this symbol table:

read_data.o:
         U my_array
00000000 T read_data

where the “U” line means the file makes use of an undefined symbol named “my_array”.

Linking these pieces together, with a suitable main(), yields a binary containing the C runtime from avr-libc, together with our code:

0000003c <my_array>:
  3c:   0c 22 38 4e        ."8N

00000040 <read_data>:
  40:   ec e3        ldi r30, 0x3C
  42:   f0 e0        ldi r31, 0x00
  44:   e8 0f        add r30, r24
  46:   f1 1d        adc r31, r1
  48:   84 91        lpm r24, Z
  4a:   08 95        ret

Note that the linker has not only moved the pieces to their final addresses, it has also fixed the arguments of the ldi instructions, which now hold the correct address of my_array.
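The patching step can be sketched in C. The ldi opcode has the bit layout 1110 KKKK dddd KKKK, where dddd is the register number minus 16 and K is the 8-bit immediate split into two nibbles. The function names below are hypothetical and this is only a model of what the two relocation types ask for, not the linker's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Encode "ldi Rd, K": 1110 KKKK dddd KKKK, valid for r16..r31 only. */
uint16_t encode_ldi(uint8_t d, uint8_t k)
{
    assert(d >= 16 && d <= 31);
    return (uint16_t)(0xE000u | ((k & 0xF0u) << 4) | ((d - 16u) << 4) | (k & 0x0Fu));
}

/* Replace the immediate of an existing ldi, keeping the register field. */
uint16_t patch_ldi(uint16_t opcode, uint8_t k)
{
    assert((opcode & 0xF000u) == 0xE000u);   /* must be an ldi */
    return (uint16_t)((opcode & 0xF0F0u) | ((k & 0xF0u) << 4) | (k & 0x0Fu));
}

/* Hypothetical model of R_AVR_LO8_LDI: insert the low address byte. */
uint16_t reloc_lo8_ldi(uint16_t opcode, uint16_t addr)
{
    return patch_ldi(opcode, (uint8_t)(addr & 0xFF));
}

/* Hypothetical model of R_AVR_HI8_LDI: insert the high address byte. */
uint16_t reloc_hi8_ldi(uint16_t opcode, uint16_t addr)
{
    return patch_ldi(opcode, (uint8_t)(addr >> 8));
}
```

Applied to the listings above, the placeholder ldi r30, 0x00 is the opcode 0xE0E0 (stored little-endian as e0 e0). Patching it with the low byte of 0x003C gives 0xE3EC, which is the ec e3 seen in the linked binary, while the high-byte instruction stays at 0xE0F0 because the high byte of the address is zero.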