I'm taking a Computer Design course which is MIPS programming (we're using the MARS simulator).
We got an assignment and I got very confused. I'm new to this and having some problems. My task was as follows: define the following in .data:
buf: .space 21
buf1: .space 20
Get a 20-character string from the user using syscall 8, then compare the ASCII values of buf[i] and buf[i+1]: if the difference is positive, copy '+' to buf1; if it's negative, copy '-'; and if the values are equal, copy '='. In the end, print buf1 and the number of '=' in buf1.
What I have so far is:
.data
buf: .space 21
buf1: .space 20
msg1: .asciiz "The number of identical char in a row is: "
#+ is 43 in ascii
#- is 45 in ascii
#= is 61 in ascii decimal
################# Code segment ####################
#
.text
.globl main
main: # main program entry
la $a0, buf
li $a1, 20
li $v0, 8
syscall
loop:
la $s0, buf
lb $t0, 0($s0) #buf[0]
addi $s0, $s0, 1 # buf++
lb $t1, 0($s0) #buf[1]
beqz $t0, exit #if null, terminate
bgt $t0, $t1, greater
blt $t0, $t1, lesser
beq $t0, $t1, equal
greater:
lb $t1, 43
sb $t1, buf1
la $a0, buf1
li $v0, 4
syscall
j loop
lesser:
j loop
equal:
j loop
print:
la $a0, buf1
li $v0, 4
syscall
exit:
li $v0, 10 # Exit program
syscall
So I am comparing buf[0] and buf[1] using $t0 and $t1 and branching with bgt to the greater label.
How do I "copy" the signs from ascii and print them? I noted to myself that '+' is 43 ASCII value. what do I do with it? how do I add it to buf1 and then print it in the end? I know the code is incomplete, but I would appreciate any help.
Thank you!
main: # main program entry
la $a0, buf
li $a1, 20
li $v0, 8
syscall
The length argument should be 21. Check the MARS docs for service 8: it reads at most n-1 characters, appends a newline if there is room, and terminates the string with a zero byte. That's why the task description suggested defining buf as .space 21.
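As a minimal sketch of that call (the label inbuf and the echo through service 4 are my own illustrative choices, not part of the task), the buffer size and the length argument match, so 20 typed characters plus the terminator fit:

```mips
        .data
inbuf:  .space 21               # hypothetical buffer: 20 chars + zero terminator

        .text
        .globl main
main:
        la   $a0, inbuf         # where to store the input
        li   $a1, 21            # n = 21, so at most n-1 = 20 chars are read
        li   $v0, 8             # MARS service 8: read string
        syscall

        la   $a0, inbuf         # echo the stored string back
        li   $v0, 4             # MARS service 4: print string
        syscall

        li   $v0, 10            # MARS service 10: exit
        syscall
```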
loop:
la $s0, buf
You overwrite the source address in s0 on every pass through the loop, so the loop always rereads buf[0] and buf[1]. (How do you even manage to post that to SO? Didn't you run your code in the debugger step by step, examining what happens in the registers and the CPU?)
lb $t0, 0($s0) #buf[0]
addi $s0, $s0, 1 # buf++
lb $t1, 0($s0) #buf[1]
As you don't use s0 anywhere else, you can avoid the addi by doing lb $t1, 1($s0). Then again, once you stop resetting s0 each time, the addi may be handy again, so this just points out that the memory operand syntax has an offset part.
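A small sketch of both points, assuming a hard-coded sample string (the labels demo, walk and done are hypothetical, not from the original code): the pointer is loaded once, before the loop label, and the neighbouring character is read through the offset field.

```mips
        .data
demo:   .asciiz "abc"           # hypothetical sample string

        .text
        .globl main
main:
        la   $s0, demo          # set the pointer ONCE, before the loop label
walk:
        lb   $t0, 0($s0)        # demo[i]
        lb   $t1, 1($s0)        # demo[i+1] via the offset; s0 itself is unchanged
        beqz $t1, done          # stop when the next char is the zero terminator
        addi $s0, $s0, 1        # ++i, and the new value survives into the next pass
        j    walk
done:
        li   $v0, 10            # MARS service 10: exit
        syscall
```

Stepping through this in the debugger shows s0 advancing by one on each pass, which is exactly what the original loop never did.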
beqz $t0, exit #if null, terminate
Consider also checking for newline character, unless you want to process that one too (looks like the task description does not explicitly state what to do about it, but from the wording it looks like only printable characters are meant to be processed).
Also, you are checking whether the first character is the zero terminator. But that's too late: one iteration earlier the zero terminator was already the second character in t1, so the previous pass compared the last character against zero.
bgt $t0, $t1, greater
blt $t0, $t1, lesser
beq $t0, $t1, equal
BTW, bgt and blt are pseudo-instructions. If you check the disassembly of the final machine code, you will see that each translates into more than one MIPS instruction. Normally there's not much point in paying attention to this, especially if you are just starting with MIPS assembly, but in this special case the code you wrote will irk any experienced MIPS programmer.
For a start, do the beq first, because that's a native MIPS instruction.
Then you can do either bgt or blt, whichever you prefer. Finally, drop the third branch completely: you can reach that point in the code only in the remaining third case, so there's no point in verifying it again with another branch instruction. That actually saves two native instructions, because each of the greater/less branches is composed of two of them.
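The ordering described above can be sketched like this, assuming two hard-coded sample values (the labels isEqual, isLess and show are hypothetical). The "greater" case needs no branch of its own because it is the only case left:

```mips
        .text
        .globl main
main:
        li   $t0, 'a'           # hypothetical sample values
        li   $t1, 'b'
        beq  $t0, $t1, isEqual  # native instruction, so test it first
        blt  $t0, $t1, isLess   # pseudo-instruction, assembled as slt + bne
        # fall through: only "greater" remains, no third branch needed
        li   $a0, '+'
        j    show
isLess:
        li   $a0, '-'
        j    show
isEqual:
        li   $a0, '='
show:
        li   $v0, 11            # MARS service 11: print character in $a0
        syscall
        li   $v0, 10            # MARS service 10: exit
        syscall
```

With 'a' and 'b' as inputs this takes the isLess path and prints '-'.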
lb $t1, 43
This loads a byte value from address 43. You probably wanted li $t1, 43 to load the '+' character value into register t1. Also, to keep the source easier to read, you can write li $t1, '+' directly. Most decent assemblers will convert the ASCII character into its value for you, so you don't need to keep the whole ASCII table in your head.
sb $t1, buf1
And this writes the character to buf1[0] in memory. If you know that s0 starts at the address of buf, and that buf1 is 21 bytes away from it, you can abuse the s0 value for writing to buf1 as well, like sb $t1, 20($s0) (only +20, because the addi increment of s0 already happened while the values were compared). Or, if you want your code less cryptic, preload some other register with the address of buf1, like la $s1, buf1, and use that one.
You can output the whole buf1 string at the end, when it is finished (but write each +/-/= char at the appropriate index instead of overwriting buf1[0] all the time). Also add a zero terminator after the last char (the 19th at most; figure out why).
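A minimal sketch of that write pattern, assuming a tiny hypothetical buffer named out and two hard-coded signs instead of real comparison results: each sb is followed by a pointer increment, and the zero terminator goes in last so that service 4 knows where to stop.

```mips
        .data
out:    .space 4                # hypothetical output buffer

        .text
        .globl main
main:
        la   $s1, out           # separate write pointer for the output string
        li   $t1, '+'
        sb   $t1, 0($s1)        # out[0] = '+'
        addi $s1, $s1, 1        # advance, so the next sb does not overwrite out[0]
        li   $t1, '-'
        sb   $t1, 0($s1)        # out[1] = '-'
        addi $s1, $s1, 1
        sb   $zero, 0($s1)      # zero terminator after the last char

        la   $a0, out
        li   $v0, 4             # MARS service 4: print string, prints "+-"
        syscall
        li   $v0, 10            # MARS service 10: exit
        syscall
```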
How do I "copy" the signs from ascii and print them? I noted to myself that '+' is 43 ASCII value. what do I do with it? how do I add it to buf1 and then print it in the end?
First you have to realize what a "string" is in the MARS simulated machine. If you run your code in the debugger, you can see the data segment content in the "Execute" window. Unfortunately I didn't find out how to switch it from "word" view to "byte" view in MARS, so you will have to understand what little-endian means and how the memory looks at byte level even when "word" values are displayed. You can at least switch the ASCII interpretation of values on and off; it is still little-endian adjusted. For example, your msg1 string contains the bytes ... 20 6e 75 6d 62 65 72 ..., which are the ASCII characters " number". Because MARS shows "words" in the debugger, "numb" will be shown as "b m u n" (if the word starts at the address of 'n'; in my source below it actually lands a bit differently, and the part around "...e number..." shows up as the words "u n e" "r e b m").
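To see the byte order for yourself, here is a small sketch (the label txt is hypothetical, and .align 2 is only there to make sure the word starts exactly at 'n'). It loads the four bytes of "numb" as one word and prints it with the MARS hexadecimal print service:

```mips
        .data
        .align 2                # make sure the string starts on a word boundary
txt:    .asciiz "numb"          # bytes in memory: 6e 75 6d 62 00

        .text
        .globl main
main:
        la   $t0, txt
        lw   $a0, 0($t0)        # little-endian: the word value is 0x626d756e
        li   $v0, 34            # MARS service 34: print integer in hexadecimal
        syscall
        li   $v0, 10            # MARS service 10: exit
        syscall
```

The lowest address holds 'n' (6e), yet it ends up as the lowest byte of the word, which is why the word view reads "b m u n".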
So a string in memory is a contiguous memory area where each byte contains one ASCII value encoding a single character. For the v0=4 "print string" output service, the final byte should contain zero, to let the MARS service know that this is the end of the string.
I.e. the directive .asciiz "AB01" will assemble as five bytes defined in memory, with values in hex: 41 42 30 31 00
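As a quick check of that equivalence, this sketch defines the same five bytes twice, once with .asciiz and once with .byte (the labels strA, strB and nl are hypothetical), and prints both with service 4:

```mips
        .data
strA:   .asciiz "AB01"                          # assembler adds the zero byte
strB:   .byte   0x41, 0x42, 0x30, 0x31, 0x00    # same five bytes spelled out
nl:     .asciiz "\n"

        .text
        .globl main
main:
        la   $a0, strA
        li   $v0, 4             # MARS service 4: print string, prints "AB01"
        syscall
        la   $a0, nl
        li   $v0, 4
        syscall
        la   $a0, strB
        li   $v0, 4             # prints "AB01" again
        syscall
        li   $v0, 10            # MARS service 10: exit
        syscall
```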
By this logic, the working code below prepares the output string in the memory area starting at the buf1 address (where 20 bytes of space are reserved by the directive .space 20) by writing a single byte into "buf1[j]" after each comparison and then incrementing the pointer (like ++j, but actually the whole pointer is adjusted; the index "j" exists only in the comments).
After all input characters are compared, the final zero terminator is written to the end of the output string, and then the v0=4 "print string" syscall service can be used to output it.
So after applying all my own advice and my l33t-sk1llz, I ended up with this program:
.data
buf: .space 21
buf1: .space 20
msg1: .asciiz "\nThe number of identical char in a row is: "
msg1_end:
.text
.globl main
main: # main program entry
la $s0, buf # load "buf" address into s0, also for later usage
# let the user input 20 character long string into memory (at address "buf")
move $a0, $s0
li $a1, 21
li $v0, 8
syscall
The beginning is almost identical to yours, except for fixing the maximum input length and setting up s0 early.
# set up registers before main processing loop (s0 is already "buf" address)
la $t2, buf1 # output string address
li $t3, '\n' # newline character in t3
li $t4, '+'
li $t5, '-'
li $t6, '='
move $s1, $zero # s1 will be counter of equal chars
I put all the ASCII characters into registers, as I had enough spare registers, so I don't have to load each character with li ahead of every sb when writing the output string.
# compare each inputted character with next one, and produce output string
compareEachCharInInput:
lb $t0, 0($s0) # buf[i]
addi $s0, $s0, 1 # ++i
lb $t1, 0($s0) # buf[i+1]
beq $t1, $zero, outputResultNL # if later char is zero, end processing
beq $t1, $t3, outputResult # if later char is newline, end processing
beq $t0, $t1, equal
blt $t0, $t1, lesser
# greater case
sb $t4, ($t2) # buf1[j] = '+'
addi $t2, $t2, 1 # ++j
j compareEachCharInInput
lesser:
sb $t5, ($t2) # buf1[j] = '-'
addi $t2, $t2, 1 # ++j
j compareEachCharInInput
equal:
sb $t6, ($t2) # buf1[j] = '='
addi $t2, $t2, 1 # ++j
addi $s1, $s1, 1 # ++equal_counter
j compareEachCharInInput
outputResultNL: # output newline character, when user did enter full 20 characters
move $a0, $t3 # reuse the newline char in t3 (still there)
li $v0, 11
syscall # because the output for less than 20 chars contains \n from input
outputResult:
sb $zero, ($t2) # add zero terminator to output string
Erratum: why t2 is still valid here. I added the "outputResultNL" feature, which outputs an additional newline char, quite late in the programming and didn't plan for it, so t2 was still used as a valid pointer after that syscall. The MIPS calling convention does not require subroutines to preserve values in $tX registers. But syscall looks to be a special case, according to the MARS documentation: "MIPS register contents are not affected by a system call, except for result registers as specified in the table below." So this code is still valid even with the t2 usage. In the remaining code I avoided $tX registers wherever they were not fully under my control, i.e. I used them only between syscall calls and used $sX registers for values that I want preserved. It turns out that was a paranoid precaution, but at least this paragraph may help you understand the difference, and the purpose of a "calling convention", if you ever have to write some kind of public subroutine for MIPS.
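To illustrate the convention only (this is a separate toy program, not part of the solution, and the subroutine clobber is hypothetical), here a callee legally destroys a $tX register while the caller's $sX value is expected to survive:

```mips
        .text
        .globl main
main:
        li   $s1, 7             # value needed after the call: keep it in $sX
        li   $t2, 7             # a $tX copy: the callee is allowed to destroy it
        jal  clobber
        move $a0, $s1           # s1 is still 7; t2 is not, it was overwritten
        li   $v0, 1             # MARS service 1: print integer, prints 7
        syscall
        li   $v0, 10            # MARS service 10: exit
        syscall

clobber:                        # hypothetical subroutine
        li   $t2, 0             # fine by convention: $tX need not be preserved
        jr   $ra
```

A callee that wants to use an $sX register would have to save and restore it itself, which is the other half of the same convention.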
la $a0, buf1 # output string address
li $v0, 4
syscall
# output number of identical neighbour chars
la $a0, msg1
li $v0, 4
syscall # label outputted
move $a0, $s1
li $v0, 1
syscall # counter outputted
# Exit program
li $v0, 10
syscall
Example input + output:
11223344556677889900
=-=-=-=-=-=-=-=-=+=
The number of identical char in a row is: 10
Also note that what the label calls "identical chars in a row" is actually the number of '=' signs in the output string, so it is not exactly "chars in a row". Counting '=' signs is the more accurate reading of the task description:
in the end print buf1 and the number of '=' in buf1
So I would suggest fixing the label text too.
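One possible wording, purely as a suggestion (the text is mine, not from the task), shown in a tiny standalone sketch where the hard-coded 10 stands in for the real counter:

```mips
        .data
msg1:   .asciiz "\nThe number of '=' signs in buf1 is: "

        .text
        .globl main
main:
        la   $a0, msg1
        li   $v0, 4             # MARS service 4: print string
        syscall
        li   $a0, 10            # placeholder for the counter kept in s1
        li   $v0, 1             # MARS service 1: print integer
        syscall
        li   $v0, 10            # MARS service 10: exit
        syscall
```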