Reputation: 45
I'm quite new to MIPS, so excuse me if this is a dumb question. I have the following exercise as an assignment and I would be very grateful for any ideas or starting points.
I'm supposed to create a function that receives the arguments n (the amount of numbers) and then n numbers, and then returns the sum of those numbers on the stack. How would I start this? I think the function could receive more than 4 numbers, and the fact that the number of actual numbers varies is what confuses me. The function signature should look like this: (int n, int number1, int number2, ... etc).
Could I store the numbers on a stack and then use the stack as a parameter to the function? If so, how would I do it?
UPDATE: So what I have in mind as of now (with the help I received) looks like this:
sum:
addu $t3,$sp,16 #add to t3 address of sp+16
addu $a1,$a1,$a2 #adding sum to a1,a1 is first element a2 second and a3 third
addu $a1,$a1,$a3
li $t0,4 #start with i=4
bge $t0,$a0,end_for #while i<n
lw $v0,0($t3) #load v0 with value in stack
addu $a1,$v0,$a1 #add to sum
addi $t3,$t3,4 #increment stack and go up for next element
addi $t0,$t0,1
end_for:
li $v0,1
move $a0,$a0
syscall
jr $ra
I tried assembling it as it is but my MARS stops responding. Any clue as to why?
Upvotes: 0
Views: 392
Reputation: 366094
In the normal MIPS calling convention, args after the 4th will already be stored on the call stack, placed there by your caller.
The standard calling convention leaves padding before stack args, where you could store the register args to create a contiguous array of all the args. This PDF has a diagram, and see also MIPS function call with more than four arguments
This is normally called "shadow space" in x86-64 Windows. But MIPS jal doesn't store anything to memory (unlike x86, which pushes a return address on the stack, MIPS puts the return address in $ra), so even if the calling convention didn't include this shadow space, a function could still adjust SP first and then store register args contiguous with stack args. So the only benefit I can see is giving tiny functions extra scratch space without having to adjust the stack pointer. This is less useful than on x86-64, where it isn't easily possible to create an array of args without it.
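To make the "contiguous array" idea concrete, here is a minimal C sketch that models the layout rather than touching the real stack. The name sum_via_home_slots and the frame parameter are hypothetical: frame stands in for the callee's view of memory at entry, with frame[0..3] as the reserved slots for $a0..$a3 and frame[4..] as the args the caller already stored.

```c
/* Hypothetical model of the callee's view of memory at entry:
 *   frame[0..3]  reserved slots for $a0..$a3 (contents undefined on entry)
 *   frame[4..]   5th and later args, already stored there by the caller
 * n counts the numbers to add; a1..a3 stand in for $a1..$a3. */
int sum_via_home_slots(int n, int a1, int a2, int a3, int *frame)
{
    /* Dump the register args into the slots right below the stack args. */
    frame[1] = a1;
    frame[2] = a2;
    frame[3] = a3;

    /* Numbers 1..n now form one contiguous array starting at frame[1]. */
    int sum = 0;
    for (int i = 1; i <= n; i++)
        sum += frame[i];
    return sum;
}
```

If n is less than 3, the unused register values are still stored but never read, which is harmless because the slots are scratch space anyway.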
Or you could peel the first 3 sum iterations that handle $a1 .. $a3 (again assuming the standard MIPS calling convention with the first 4 args in registers, $a0 being int n).
Then loop over stack args if you haven't got to n yet.
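As a rough sketch of that peeling approach in C (not MIPS), assuming a hypothetical stack_args pointer that stands in for the address sp+16 where the caller put the 5th and later args:

```c
/* a1..a3 stand in for $a1..$a3; stack_args is a hypothetical stand-in
 * for the address of the first stack-passed argument (sp+16 on entry). */
int sum_peeled(int n, int a1, int a2, int a3, const int *stack_args)
{
    int sum = 0;

    /* Peeled iterations: only add a register arg if n says it is valid. */
    if (n >= 1) sum += a1;
    if (n >= 2) sum += a2;
    if (n >= 3) sum += a3;

    /* Remaining numbers, if any, come from the caller's stack area. */
    for (int i = 3; i < n; i++)
        sum += stack_args[i - 3];
    return sum;
}
```

The checks against n matter: with fewer than 3 numbers, the remaining arg registers hold garbage and must not be added.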
You could write a C function and look at optimized compiler output, like this:
#include <stdarg.h>
int sumargs(int n, ...) {
va_list args;
va_start(args, n);
int sum=0;
for (int i=0 ; i<n ; i++){
sum += va_arg(args, int);
}
va_end(args);
return sum;
}
va_start and va_arg aren't real functions; they'll expand to some inline code. va_start(args,n) dumps the arg-passing registers after n into the shadow space (contiguous with stack args, if any).
MIPS gcc unfortunately doesn't support the -mregnames option to use names like $a0 and $t0, but Google found a nice table of register name<->number.
MIPS asm output from the Godbolt compiler explorer:
# gcc5.4 -O3 -fno-delayed-branch
sumargs(int, ...):
# on entry: SP points 16 bytes below the first non-register arg, if there is one.
addiu $sp,$sp,-16 # reserve another 16 bytes
addiu $3,$sp,20 # create a pointer to the base of this array
sw $5,20($sp) # dump $a1..$a3 into the shadow space
sw $6,24($sp)
sw $7,28($sp)
sw $3,8($sp) # spill the pointer into scratch space for some reason?
blez $4,$L4 # check if the loop should run 0 times.
nop # branch-delay slot. (MARS can simulate a MIPS without delayed branches, so I told gcc to fill the slots with nops)
move $5,$0 # i=0
move $2,$0 # $v0 = sum = 0
$L3: # do {
lw $6,0($3)
addiu $5,$5,1 # i++
addu $2,$2,$6 # sum += *arg_pointer
addiu $3,$3,4 # arg_pointer++ (4 bytes)
bne $4,$5,$L3 # } while(i != n)
nop # fill the branch-delay slot
$L2:
addiu $sp,$sp,16
j $31 # return (with sum in $v0)
nop
$L4:
move $2,$0 # return 0
b $L2
nop
Looping on do {}while(--n) would have been more efficient. It's a missed optimization that gcc doesn't do this when compiling the for loop.
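For illustration, a minimal sketch of that loop shape in C, assuming the args are already contiguous in memory (sum_countdown and its args pointer are hypothetical names, not something gcc emits):

```c
/* args points at the first of n contiguous ints. */
int sum_countdown(int n, const int *args)
{
    int sum = 0;
    if (n <= 0)              /* same up-front check as the blez in the asm */
        return sum;
    do {
        sum += *args++;      /* load, add, advance the pointer */
    } while (--n != 0);      /* count n down instead of keeping a separate i */
    return sum;
}
```

Counting n down to zero means the loop no longer needs a separate i counter, and the loop branch can compare against $zero.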
Upvotes: 4