opfball91

Reputation: 21

How to get addresses for static data in MIPS running in bare mode (no la pseudo-instruction)

I'm trying to read a string from console input. I'm simulating with xspim, but this has to run in bare mode, which means I can't use pseudo-instructions, and most of what I've found online relies on them. All the documentation I've read says to use the "la" instruction to put the string's address in $a0, but that instruction is not available to us. I read about what it translates to: effectively an "lui" followed by an "ori". The part that's throwing me is that we're supposed to supply the number of bytes between the first data location (always 0x10000000) and the address of the first byte of the string. I'm not sure what the address of the first byte of my string would be. Here's what I have:

.globl main
.globl done
.globl convert

.data
prompt:  .asciiz   "Enter a decimal number, to quit type 'quit':" #45
result:  .asciiz   "The number you entered is " #72
input:   .space    64

.text
convert:


main:       addi $v0, $0, 4         #Print prompt to enter number
            lui $a0, 0x1000         #Address of prompt
            syscall                 #Display prompt

            addi $v0, $0, 8         #Setting up syscall to read in string
            lui $at, 4097
            ori $a0, $at, input     #Where I want my string to be stored
            addi $a1, $0, 64        #How long my string will be
            syscall                 #Syscall to read in string

Upvotes: 2

Views: 1488

Answers (1)

Peter Cordes

Reputation: 364318

Normally, assemblers and linkers for RISC machines support splitting an address into two halves, so you can write something like lui $reg, upper(input) followed by ori $reg, $reg, lower(input). That way addresses only have to be link-time constants, not assemble-time constants.

For example, if you look at MIPS gcc's assembly output on Godbolt (gcc -O3 -S, not disassembling the linked binary):

int my_global;
int *foo() { return &my_global; }
    lui     $2,%hi(my_global)
    j       $31
    addiu   $2,$2,%lo(my_global)    # branch-delay slot (real MIPS has one; SPIM doesn't simulate it by default)

int bar() { return my_global; }
    lui     $2,%hi(my_global)
    lw      $2,%lo(my_global)($2)
    j       $31
    nop

    .section        .bss,"aw",@nobits
    .align  2
    .type   my_global, @object
    .size   my_global, 4
my_global:
    .space  4

Notice that bar used the lo half of the address as an offset in lw instead of generating the full address in a register and then using an offset of 0 in a load instruction.
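
The split itself is just arithmetic. Here's a rough sketch of it in C, not taken from gcc or SPIM; the helper names are made up for illustration and 32-bit addresses are assumed. The one subtlety is that ori zero-extends its 16-bit immediate while addiu and lw/sw offsets sign-extend theirs, so the high half has to be rounded up when it is paired with addiu or a load/store offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names; they only model the arithmetic. */

/* Low 16 bits of the address, as stored in the immediate field. */
uint16_t lo16(uint32_t addr) { return (uint16_t)(addr & 0xFFFFu); }

/* High half to pair with ori, which zero-extends its immediate. */
uint16_t hi16_for_ori(uint32_t addr) { return (uint16_t)(addr >> 16); }

/* High half to pair with addiu or a lw/sw offset, which sign-extend:
   round up when bit 15 of the address is set. */
uint16_t hi16_for_addiu(uint32_t addr) {
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Register value after: lui r, hi ; ori r, r, lo */
uint32_t lui_ori(uint32_t addr) {
    return ((uint32_t)hi16_for_ori(addr) << 16) | lo16(addr);
}

/* Register value after: lui r, hi ; addiu r, r, lo */
uint32_t lui_addiu(uint32_t addr) {
    int32_t lo = (int32_t)(lo16(addr) ^ 0x8000) - 0x8000; /* sign-extend */
    uint32_t r = ((uint32_t)hi16_for_addiu(addr) << 16) + (uint32_t)lo;
    assert(r == addr); /* the pair must rebuild the full address */
    return r;
}
```

For an address like 0x10000048 both variants give a high half of 0x1000 and a low half of 0x48. They only differ once the low half reaches 0x8000: 0x10008004 needs lui 0x1001 when the low half is added with addiu.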


A useful optimization, if you know two addresses are in the same 64 KiB block, is to reuse the same lui result with different low-half constants in ori. I think that's the case here for your data. MARS guarantees that syscall preserves all registers except the result register; I assume SPIM does the same.


If you have to do it manually (without a linker to help you), then yes, you have to know the absolute address of your data.

In your case, there's no CRT startup code or anything else that puts its data in the .data section. The stuff in your .data section is at the very start of the data segment of your executable, so prompt: will have address 0x10000000.

You didn't ask for any padding or alignment, so you won't get any. Your data will be assembled into the output packed together. (Unlike C, where char prompt[45], result[]; have no guarantee of being contiguous.)
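
Because the data is packed, the offsets can be worked out by hand from the string lengths. A small C sketch of that arithmetic, assuming the two strings exactly as written in the question, a data segment starting at 0x10000000, and no padding (the macro, enum, and function names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Strings copied from the question's .data section. sizeof on a string
   literal counts the terminating NUL, like .asciiz does. */
#define PROMPT_TEXT "Enter a decimal number, to quit type 'quit':"
#define RESULT_TEXT "The number you entered is "

enum {
    PROMPT_OFFSET = 0,
    RESULT_OFFSET = PROMPT_OFFSET + sizeof(PROMPT_TEXT),  /* 45 */
    INPUT_OFFSET  = RESULT_OFFSET + sizeof(RESULT_TEXT)   /* 72 */
};

/* Absolute address, assuming .data starts at 0x10000000 with no padding. */
uint32_t label_address(uint32_t offset) {
    assert(offset < 0x8000u); /* must fit a non-negative 16-bit immediate */
    return 0x10000000u + offset;
}
```

That puts prompt at 0x10000000, result at 0x1000002D, and input at 0x10000048, which agrees with the #45 and #72 comments in the question.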

I haven't used SPIM, but hopefully it will let you write result-prompt and input-prompt.

e.g.

main:
        addi $v0, $0, 4         #Print prompt to enter number
        lui $a0, 0x1000         #Address of prompt
        syscall                 #Display prompt

        # $a0 still holds 0x1000 << 16 (assuming syscall preserves it)
        addiu $a0, $a0, input-prompt      # buffer address
        addi $a1, $0, 64                  # length

        addi $v0, $0, 8         # syscall 8 = read string
        syscall                 # read_string(input, 64)

        addiu $t0, $a0, 0       # copy pointer to input
        addiu $a0, $a0, result-input     # negative offset (addiu sign-extends): point at the output message

Or, instead of copying the pointer to another register, you could leave $a0 pointing at result and add input-result to the offset in every lb / sb that accesses input[].
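
To see why the negative offset works with an add but would not with ori, here's a small C model of the pointer adjustments. It's only a sketch: the function and enum names are made up, the offsets 0, 45 and 72 are assumed from the packed layout of the question's strings, and $a0 is assumed to survive the syscalls.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed label offsets from the start of .data (no padding). */
enum { OFF_PROMPT = 0, OFF_RESULT = 45, OFF_INPUT = 72 };

/* Model of an add with a sign-extended 16-bit immediate,
   wrapping modulo 2^32 and never trapping. */
uint32_t add_imm16(uint32_t reg, int imm) {
    assert(imm >= -32768 && imm <= 32767); /* must fit the 16-bit field */
    return reg + (uint32_t)(int32_t)imm;
}

/* The 16 bits actually stored in the instruction word. */
uint16_t imm_field(int imm) { return (uint16_t)imm; }
```

Starting from 0x10000000, adding input-prompt (72) gives 0x10000048, and adding result-input (-27) from there gives 0x1000002D. The immediate for -27 is encoded as 0xFFE5, which ori would zero-extend and OR in as 65509 rather than subtracting 27.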

Upvotes: 1
