Print ASCII characters in MIPS

Shoham Alhanati · Aug 12, 2017 · Viewed 10.9k times

I'm taking a Computer Design course which is MIPS programming (we're using the MARS simulator).

We got an assignment and I got very confused. I'm new to this and am having some problems. My task was as follows. Define the following in .data:

buf: .space 21
buf1: .space 20

Get a 20-char-long string from the user using syscall 8, then compare the ASCII values of buf[i] and buf[i+1]: if the difference is positive, copy '+' to buf1; if it's negative, copy '-'; and if they are equal, copy '='. At the end, print buf1 and the number of '=' in buf1.

What I have so far is:

.data
buf: .space 21
buf1: .space 20
msg1: .asciiz "The number of identical char in a row is: "
#+ is 43 in ascii
#- is 45 in ascii
#= is 61 in ascii decimal
################# Code segment ####################
#
.text
.globl main
main:   # main program entry
    la $a0, buf 
    li $a1, 20
    li $v0, 8
    syscall

loop:  
    la $s0, buf
    lb $t0, 0($s0) #buf[0]
    addi $s0, $s0, 1 # buf++
    lb $t1, 0($s0) #buf[1]
    beqz $t0, exit #if null, terminate
    bgt $t0, $t1, greater
    blt $t0, $t1, lesser
    beq $t0, $t1, equal
greater:
    lb $t1, 43
    sb $t1, buf1
    la $a0, buf1
    li $v0, 4
    syscall

    j loop
lesser:

    j loop
equal:

    j loop
print:
    la $a0, buf1
    li $v0, 4
    syscall
exit:
    li $v0, 10       # Exit program
    syscall

So I am comparing buf[0] and buf[1] using $t0 and $t1, and using bgt to branch to the greater label.

How do I "copy" the signs from ASCII and print them? I noted to myself that '+' is ASCII value 43. What do I do with it? How do I add it to buf1 and then print it at the end? I know the code is incomplete, but I would appreciate any help.

Thank you!

Answer

Ped7g · Oct 10, 2017
main:   # main program entry
    la $a0, buf 
    li $a1, 20
    li $v0, 8
    syscall

The length argument should be 21 (check the MARS docs for service 8: for length n it reads at most n-1 characters, appends a newline if the input was shorter than that, and terminates the string with a null byte). That's why the task description defines buf as .space 21.

loop:  
    la $s0, buf

You overwrite the s0 source address on every iteration of the loop, so the code always compares buf[0] with buf[1] (how did you even manage to post that to SO... didn't you run your code in the debugger step by step, examining what happens in the registers and the CPU?).

    lb $t0, 0($s0) #buf[0]
    addi $s0, $s0, 1 # buf++
    lb $t1, 0($s0) #buf[1]

As you don't use s0 anywhere else, you can avoid the addi by doing lb $t1, 1($s0). Then again, once you stop resetting s0 each time, the addi may come in handy again, so this is just to point out that the offset part of the memory operand syntax exists.

    beqz $t0, exit #if null, terminate

Consider also checking for the newline character, unless you want to process that one too (the task description does not explicitly state what to do about it, but from the wording it looks like only printable characters are meant to be processed).

Also, you test the first character (t0) for the zero terminator. That's too late: by the time the terminator reaches t0, the previous iteration already had it as the second character in t1, so you have already compared the last character against the zero value.

    bgt $t0, $t1, greater
    blt $t0, $t1, lesser
    beq $t0, $t1, equal

BTW, bgt and blt are pseudo-instructions. If you check the disassembly of the final machine code, you will see that each translates into several MIPS instructions. Normally there's not much point in paying attention to this, especially if you are just starting with MIPS assembly, but in this particular case the code you wrote will irk any experienced MIPS programmer.

For a start, do the beq first, because that's a native MIPS instruction. Then you can do either bgt or blt, whichever you prefer. Finally, drop the third branch completely, because you can reach that point in the code only in the one remaining case, so there's no point in verifying it again with another branch instruction. That actually saves two native instructions, because each of the greater/less branches is composed of two of them.
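The branch ordering described above can be sketched as a small standalone MARS program. The two compared characters are hard-coded purely for illustration, and the sketch assumes the default MARS settings (no delayed branching):

```mips
.text
.globl main
main:
    li   $t0, 'a'            # stand-in for buf[i]   (illustrative value)
    li   $t1, 'b'            # stand-in for buf[i+1] (illustrative value)
    beq  $t0, $t1, equal     # native instruction, so test equality first
    blt  $t0, $t1, lesser    # pseudo-instruction (slt + bne)
    # fall through: only the "greater" case is left, no third branch needed
    li   $a0, '+'
    j    print
lesser:
    li   $a0, '-'
    j    print
equal:
    li   $a0, '='
print:
    li   $v0, 11             # MARS service 11: print character in $a0
    syscall
    li   $v0, 10             # exit
    syscall
```

With 'a' and 'b' as inputs the lesser path is taken, so the program prints a single '-'.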

    lb $t1, 43

This loads a byte value from memory address 43. You probably wanted li $t1, 43, to load the '+' character value into the t1 register. To keep the source easier to read, you can also write li $t1, '+' directly. Most decent assemblers will convert the ASCII character into its value for you, so you don't need to keep the whole ASCII table in your head.
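As a minimal sketch of the difference, the following standalone MARS program puts the character code into a register as an immediate value and prints it, without touching memory at all:

```mips
.text
.globl main
main:
    li   $t1, '+'     # t1 = 43, the ASCII code of '+', loaded as an immediate
    move $a0, $t1
    li   $v0, 11      # MARS service 11: print character in $a0
    syscall
    li   $v0, 10      # exit
    syscall
```

This prints a single '+'. Replacing the li with lb $t1, 43 would instead try to read memory at address 43, which is not part of the program's data.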

    sb $t1, buf1

And this writes the character to buf1[0] in memory. Since you know that s0 starts at the buf address and that buf1 is 21 bytes after it, you can abuse the s0 value for writing to buf1 as well, like sb $t1, 20($s0) (only +20, because the addi increment of s0 already happened while loading the values to compare). Or, if you want your code to be less cryptic, preload another register with the address of buf1, like la $s1, buf1, and use that one.

You can output the whole buf1 string at the end, when it is finished (but write each +/-/= char at the appropriate index instead of overwriting buf1[0] all the time). Also add a zero terminator after the last char (the 19th at most, figure out why).
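A minimal sketch of the "less cryptic" variant: a dedicated register holds the write position, each sb is followed by an increment, and a zero byte closes the string before it is printed. The buffer name out and the three characters written are made up for this illustration and are not part of the assignment:

```mips
.data
out: .space 4             # hypothetical buffer: 3 chars + zero terminator

.text
.globl main
main:
    la   $s1, out         # s1 = write pointer into the output buffer
    li   $t1, '+'
    sb   $t1, 0($s1)      # out[0] = '+'
    addi $s1, $s1, 1      # advance the pointer instead of reusing out[0]
    li   $t1, '-'
    sb   $t1, 0($s1)      # out[1] = '-'
    addi $s1, $s1, 1
    li   $t1, '='
    sb   $t1, 0($s1)      # out[2] = '='
    addi $s1, $s1, 1
    sb   $zero, 0($s1)    # zero terminator after the last char
    la   $a0, out
    li   $v0, 4           # MARS service 4: print zero-terminated string
    syscall
    li   $v0, 10          # exit
    syscall
```

This prints +-= once, at the end, rather than printing after every comparison.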


How do I "copy" the signs from ASCII and print them? I noted to myself that '+' is ASCII value 43. What do I do with it? How do I add it to buf1 and then print it at the end?

First you have to realize what a "string" is on the machine MARS simulates. If you run your code in the debugger, you can see the data segment content in the "Execute" window. Unfortunately I didn't find out how to switch it from "word" view to "byte" view in MARS, so you will have to understand what little-endian means and how the memory looks at byte level, even when "word" values are displayed. You can at least switch the ASCII interpretation of values on and off, although that view is still little-endian adjusted. For example, your msg1 string contains the bytes ... 20 6e 75 6d 62 65 72 ..., which are the ASCII characters " number". As MARS shows "words" in the debugger, "numb" will be shown as "b m u n" (if the next word starts at the address of 'n'; in my source below it actually lands a bit differently, and the part around "...e number..." ends up as the words "u n e" "r e b m").
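To see the little-endian effect without reading the data segment view, here is a minimal sketch that loads the first four bytes of a string as one word and prints that word in hexadecimal. The label text is made up for this example, and the sketch relies on the string being the first item in .data so that it starts on a word boundary:

```mips
.data
text: .asciiz "number"    # bytes in memory: 6e 75 6d 62 65 72 00

.text
.globl main
main:
    la   $t0, text
    lw   $a0, 0($t0)      # one word = the four bytes 'n' 'u' 'm' 'b'
    li   $v0, 34          # MARS service 34: print integer in hexadecimal
    syscall
    li   $v0, 10          # exit
    syscall
```

The loaded word has the value 0x626d756e: the byte at the lowest address ('n', 6e) becomes the least significant byte, which is why the word view reads "b m u n".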

So a string in memory is a contiguous memory area where each byte contains one ASCII value encoding a single character. For the output service v0=4 (print string), the final byte must contain zero, to let the MARS service know where the string ends.

I.e. the directive .asciiz "AB01" assembles to five bytes in memory, with these hexadecimal values: 41 42 30 31 00
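A minimal sketch to check this in MARS: the same five bytes are defined once with .asciiz and once byte by byte, and both are passed to the print string service. The labels s1 and s2 are made up for this example:

```mips
.data
s1: .asciiz "AB01"
s2: .byte 0x41, 0x42, 0x30, 0x31, 0x00   # same bytes, written out by hand

.text
.globl main
main:
    la   $a0, s1
    li   $v0, 4       # MARS service 4: print zero-terminated string
    syscall
    la   $a0, s2
    li   $v0, 4
    syscall
    li   $v0, 10      # exit
    syscall
```

Both calls print AB01, so the output is AB01AB01.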

By this logic, the working code below prepares the output string in the memory area starting at the buf1 address (where the directive .space 20 reserves 20 bytes), by writing a single byte into "buf1[j]" after each comparison and then incrementing the pointer (like ++j, except that the whole pointer is adjusted; the index "j" exists only in the comments).

After all input characters have been compared, the final zero terminator is written at the end of the output string, and then the v0=4 print string service can be used to output it.


So after applying all my own advice and my l33t-sk1llz, I ended up with this program:

.data
buf: .space 21
buf1: .space 20
msg1: .asciiz "\nThe number of identical char in a row is: "
msg1_end:

.text
.globl main
main:   # main program entry
    la $s0, buf     # load "buf" address into s0, also for later usage
    # let the user input 20 character long string into memory (at address "buf")
    move $a0, $s0
    li $a1, 21
    li $v0, 8
    syscall

The beginning is almost identical to yours, except for fixing the maximum input length and setting up s0 early.

    # set up registers before main processing loop (s0 is already "buf" address)
    la $t2, buf1   # output string address
    li $t3, '\n'   # newline character in t3
    li $t4, '+'
    li $t5, '-'
    li $t6, '='
    move $s1, $zero # s1 will be counter of equal chars

I put all the ASCII characters into registers, as I had enough spare registers, so I don't have to load them with li ahead of each sb when writing the output string.

    # compare each inputted character with next one, and produce output string
compareEachCharInInput:
    lb $t0, 0($s0)   # buf[i]
    addi $s0, $s0, 1 # ++i
    lb $t1, 0($s0)   # buf[i+1]
    beq $t1, $zero, outputResultNL   # if later char is zero, end processing
    beq $t1, $t3, outputResult # if later char is newline, end processing
    beq $t0, $t1, equal
    blt $t0, $t1, lesser
# greater case
    sb $t4, ($t2)    # buf1[j] = '+'
    addi $t2, $t2, 1 # ++j
    j compareEachCharInInput
lesser:
    sb $t5, ($t2)    # buf1[j] = '-'
    addi $t2, $t2, 1 # ++j
    j compareEachCharInInput
equal:
    sb $t6, ($t2)    # buf1[j] = '='
    addi $t2, $t2, 1 # ++j
    addi $s1, $s1, 1 # ++equal_counter
    j compareEachCharInInput

outputResultNL:      # output newline character, when user did enter full 20 characters
    move $a0, $t3    # reuse the newline char in t3 (still there)
    li $v0, 11
    syscall          # because the output for less than 20 chars contains \n from input
outputResult:
    sb $zero, ($t2)  # add zero terminator to output string

"errata:" Why t2 is still valid here... I added the "outputResultNL" feature, which outputs an additional newline char, quite late and hadn't planned for it, so t2 is still used as a valid pointer after that syscall. The MIPS calling convention does not require subroutines to preserve values in $tX registers. But syscall appears to be a special case, according to the MARS documentation: "MIPS register contents are not affected by a system call, except for result registers as specified in the table below." So this code is still valid even with the t2 usage. In the remaining code I avoided $tX registers wherever they were not fully under my control, i.e. I used them only between syscalls and otherwise kept values I want preserved in $sX registers. That turns out to have been a paranoid precaution, but at least this paragraph may help you understand the difference and the purpose of a "calling convention", should you ever have to write some kind of public subroutine for MIPS.

    la $a0, buf1     # output string address
    li $v0, 4
    syscall
    # output number of identical neighbour chars
    la $a0, msg1
    li $v0, 4
    syscall          # label outputted
    move $a0, $s1
    li $v0, 1
    syscall          # counter outputted
    # Exit program
    li $v0, 10
    syscall

Example input + output:

11223344556677889900
=-=-=-=-=-=-=-=-=+=
The number of identical char in a row is: 10

Also note that "identical chars in a row" is actually the number of '=' signs in the output string, so it is not exactly "chars in a row". The task description is more accurate:

in the end print buf1 and the number of '=' in buf1

So I would suggest fixing the label text too.
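Coming back to the errata note about $tX versus $sX registers: a minimal sketch of what the usual MIPS calling convention asks of a subroutine. The subroutine name double_it and all values are made up for this illustration. The callee saves and restores $s0 because it uses it, while it is free to overwrite $t0 without saving it:

```mips
.text
.globl main
main:
    li   $s0, 7            # main expects this value to survive the call
    li   $a0, 5            # argument for the hypothetical subroutine
    jal  double_it
    move $a0, $v0
    li   $v0, 1            # MARS service 1: print integer (the result, 10)
    syscall
    li   $a0, '\n'
    li   $v0, 11           # MARS service 11: print character
    syscall
    move $a0, $s0
    li   $v0, 1            # print integer (s0 is still 7)
    syscall
    li   $v0, 10           # exit
    syscall

double_it:                 # returns 2 * a0 in v0
    addi $sp, $sp, -4
    sw   $s0, 0($sp)       # callee must save s0 before using it
    move $s0, $a0
    add  $v0, $s0, $s0
    li   $t0, 0            # tX registers may be overwritten freely
    lw   $s0, 0($sp)       # restore the caller's s0
    addi $sp, $sp, 4
    jr   $ra
```

The program prints 10 and then 7 on separate lines. A caller that needs a $tX value after a jal has to save it itself, which is why the program above keeps long-lived values in $sX registers.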