Reading from a file in assembly

jameshfisher · Jul 27, 2010

I'm trying to learn assembly -- x86 in a Linux environment. The most useful tutorial I can find is Writing A Useful Program With NASM. The task I'm setting myself is simple: read a file and write it to stdout.

This is what I have:

section  .text              ; declaring our .text segment
  global  _start            ; telling where program execution should start

_start:                     ; this is where code starts getting exec'ed

  ; get the filename in ebx
    pop   ebx               ; argc
    pop   ebx               ; argv[0]
    pop   ebx               ; the first real arg, a filename

  ; open the file
    mov   eax,  5           ; open(
    mov   ecx,  0           ;   read-only mode
    int   80h               ; );

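  ; read the file
    mov     ebx,  eax       ; save the file descriptor before eax is overwritten
    mov     eax,  3         ; read(
                            ;   file_descriptor (already in ebx),
    mov     ecx,  buf       ;   *buf,
    mov     edx,  bufsize   ;   bufsize
    int     80h             ; );

  ; write to STDOUT
    mov     edx,  eax       ; number of bytes read, returned by read in eax
    mov     eax,  4         ; write(
    mov     ebx,  1         ;   STDOUT,
  ; mov     ecx,  buf       ;   *buf (ecx still holds it)
    int     80h             ; );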

  ; exit
    mov   eax,  1           ; exit(
    mov   ebx,  0           ;   0
    int   80h               ; );

A crucial problem here is that the tutorial never mentions how to create a buffer, the bufsize variable, or indeed variables at all.

How do I do this?

(An aside: after at least an hour of searching, I'm vaguely appalled at the low quality of resources for learning assembly. How on earth does any computer run when the only documentation is the hearsay traded on the 'net?)

Answer

Borealid · Jul 27, 2010

Ohh, this is going to be fun.

Assembly language doesn't have variables. Those are a higher-level language construct. In assembly language, if you want variables, you make them yourself. Uphill. Both ways. In the snow.

If you want a buffer, you're going to have to either use some region of your stack as the buffer (after moving the stack pointer down to reserve the space), or use some region on the heap. If your heap is too small, you'll have to make a system call (another INT 80h) to beg the operating system for more (via brk, the call that C's sbrk wraps).
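To make the stack option concrete, here is a minimal sketch, assuming 32-bit Linux and NASM. It echoes one chunk of stdin rather than a file, just to keep it short. The file name stackbuf.asm and the 4096-byte size are my own choices, and there is no error checking.

```nasm
; stackbuf.asm - sketch: echo one chunk of stdin using a buffer on the stack
; Assumed build: nasm -f elf32 stackbuf.asm && ld -m elf_i386 -o stackbuf stackbuf.o

bufsize equ 4096            ; illustrative size, an assemble-time constant

section .text
global  _start

_start:
    sub   esp, bufsize      ; reserve bufsize bytes; esp now points at the buffer

    mov   eax, 3            ; sys_read
    mov   ebx, 0            ;   stdin
    mov   ecx, esp          ;   buffer address
    mov   edx, bufsize      ;   maximum number of bytes
    int   80h               ; eax = bytes read (not checked for errors)

    mov   edx, eax          ; write exactly as many bytes as were read
    mov   eax, 4            ; sys_write
    mov   ebx, 1            ;   stdout
    mov   ecx, esp          ;   buffer address
    int   80h

    add   esp, bufsize      ; release the buffer
    mov   eax, 1            ; sys_exit
    mov   ebx, 0
    int   80h
```

The "allocation" is nothing more than the sub esp; the buffer exists only until the matching add esp.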

Another alternative is to learn about the ELF format and create a global variable in the appropriate section: .data for initialized data, or .bss for uninitialized space such as a buffer.
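For the global-variable route, a sketch of the program from the question might look like this, assuming NASM on 32-bit Linux. The 4096-byte size is an arbitrary choice of mine, only the first bufsize bytes of the file are printed, and nothing checks whether open or read failed.

```nasm
; cat1.asm - sketch: print up to bufsize bytes of the file named in argv[1]
; Assumed build: nasm -f elf32 cat1.asm && ld -m elf_i386 -o cat1 cat1.o

bufsize equ 4096            ; assemble-time constant, not a memory location

section .bss                ; uninitialized data, takes no space in the object file
buf:    resb bufsize        ; reserve bufsize bytes

section .text
global  _start

_start:
    pop   ebx               ; argc
    pop   ebx               ; argv[0]
    pop   ebx               ; argv[1], the filename

    mov   eax, 5            ; sys_open
    mov   ecx, 0            ;   O_RDONLY
    int   80h               ; eax = file descriptor (negative on error)

    mov   ebx, eax          ; file descriptor, saved before eax is reused
    mov   eax, 3            ; sys_read
    mov   ecx, buf          ;   address of the buffer
    mov   edx, bufsize      ;   maximum number of bytes
    int   80h               ; eax = bytes read

    mov   edx, eax          ; write exactly as many bytes as were read
    mov   eax, 4            ; sys_write
    mov   ebx, 1            ;   stdout
    mov   ecx, buf
    int   80h

    mov   eax, 1            ; sys_exit
    mov   ebx, 0
    int   80h
```

Here buf is just a label for an address and bufsize is just a number the assembler substitutes, which is about as close to "variables" as it gets. Run it as ./cat1 somefile; printing a file larger than the buffer would need a loop around the read and write.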

The end result of any of these methods is a memory location you can use. But your only real "variables" like you're used to from the now-wonderful-seeming world of C are your registers. And there aren't very many of them.

The assembler might help you out with useful macros. Read the assembler documentation; I don't remember them off the top of my head.
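As one example of what the assembler can offer, NASM has %define for named constants and %macro for multi-line macros. The sketch below assumes NASM on 32-bit Linux. The macro name sys_write3 and the constant names are hypothetical, made up here for illustration, not something NASM ships with.

```nasm
; macros.asm - sketch: naming constants and wrapping a system call in a macro
; Assumed build: nasm -f elf32 macros.asm && ld -m elf_i386 -o macros macros.o

%define SYS_EXIT  1
%define SYS_WRITE 4
%define STDOUT    1

; hypothetical helper: sys_write3 fd, address, length
%macro sys_write3 3
    mov   eax, SYS_WRITE
    mov   ebx, %1           ; first macro argument: file descriptor
    mov   ecx, %2           ; second: address of the data
    mov   edx, %3           ; third: number of bytes
    int   80h
%endmacro

section .data
msg:    db "hello", 10      ; the text plus a newline
msglen  equ $ - msg         ; length computed by the assembler

section .text
global  _start

_start:
    sys_write3 STDOUT, msg, msglen

    mov   eax, SYS_EXIT
    mov   ebx, 0
    int   80h
```

The macro expands to the same four moves and int 80h you would otherwise type by hand, so it saves typing but not instructions.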

Life is tough down there at the ASM level.