Peter

Reputation: 31

What is this assembly instruction doing?

This is a snippet from an assignment on data hazards, but I am struggling to understand what it does. The words after each colon are my understanding of the instructions.

loop:
1. ADDI R2, R2, 1    : Add 1 to R2
2. LD R4, 0(R3)      : Load data at R3 address into R4 (?)
3. LD R5, 4(R3)      : Load data at R3 address into R5 (?)
4. SLT R6, R4, R5    : Set R6 = R4 < R5 ? 0 : 1
5. SD R6, 8(R3)      : Store data in R6 at R3 address (?)
6. ADDI R3, R3, 1    : Add 1 to R3 (?)
7. BNEZ R2, loop     : If R2 == 0, goto 1, else proceed to 8
8. ADD R11, R12, R13 : ???

Notes

Questions

Thanks all!

Found this community-effort cheatsheet, and IBM docs

Upvotes: 1

Views: 841

Answers (1)

puppydrum64

Reputation: 1688

I've edited the list with what each instruction does:

1. ADDI R2, R2, 1    : Add 1 to R2 and store the result in R2
2. LD R4, 0(R3)      : Load data at R3 address into R4
3. LD R5, 4(R3)      : Load data at (R3 address + 4) into R5
4. SLT R6, R4, R5    : R6 = (R4 < R5) ? 1 : 0
5. SD R6, 8(R3)      : Store data in R6 at (R3 address + 8)
6. ADDI R3, R3, 1    : Add 1 to R3 and store the result in R3
7. BNEZ R2, loop     : If R2 != 0, go to 1 (loop), else proceed to 8
8. ADD R11, R12, R13 : Add R13 to R12 and store the result in R11
  • The number in front of (R3) is a temporary offset that is added to R3 before it is dereferenced. In other words, LD R5,4(R3) has the same effect on registers as:
ADDI R3,R3,4  ;add 4 to the value in R3, and store the result in R3
LD R5,(R3)    ;treating the value in R3 as a memory address, 
              ;dereference it and store the integer at that address into R5
SUBI R3,R3,4  ;return R3 to its original state.

Except this all happens in one instruction and no modification to R3 actually takes place.
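To make the base-plus-offset idea concrete, here is a minimal Python sketch of it. It is not how any particular CPU implements the load; the register file `regs`, the `memory` dict, and the helper `load` are hypothetical names invented for this illustration, and memory is modeled as whole words keyed by address to keep it short.

```python
# Hypothetical register file and memory, invented for illustration only.
regs = {"R3": 0x40000000, "R5": 0}
memory = {0x40000000: 0xDEADBEEF, 0x40000004: 0x12345678}

def load(dest, offset, base):
    """Model of `LD dest, offset(base)`: read the word at base + offset."""
    effective_address = regs[base] + offset  # computed on the fly
    regs[dest] = memory[effective_address]   # only the destination is written

load("R5", 4, "R3")
print(hex(regs["R5"]))  # the word stored at R3 + 4
print(hex(regs["R3"]))  # the base register keeps its original value
```

The effective address lives only inside the instruction, which is why no ADDI/SUBI pair is needed around the load.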

  • The initial value of R3 matters less than that of R2. R2 is being used as a loop counter, whereas R3 is being used as a pointer to memory (what it's pointing to, I have no idea).
  • As for the relevance to data hazards, the hazard is here:
    ADDI R3,R3, 1  : Add 1 to R3 and store the result in R3.

Presumably, R3 is intended to point to a 32-bit integer. This is implied by all the offsets being in multiples of four. For illustration purposes, let's pretend that at the start of the loop, R3 = 0x40000000. All the bytes stored at these addresses are made up by me, with the exception of bytes stored at 0x40000008-0x4000000B, which were written to memory by the instruction SD R6,8(R3). (I'm assuming a big-endian architecture hence the byte order.)

0x40000000: 0xDE
0x40000001: 0xAD
0x40000002: 0xBE
0x40000003: 0xEF

0x40000004: 0x12
0x40000005: 0x34
0x40000006: 0x56
0x40000007: 0x78

0x40000008: 0x00
0x40000009: 0x00
0x4000000A: 0x00
0x4000000B: 0x01

After instruction 6 in your list executes, R4 contains 0xDEADBEEF and R5 contains 0x12345678. That's fine, but the problem is we added 1 to R3 instead of 4. This means that the numbers we're loading into R4 and R5 on the subsequent passes through the loop weren't the intended data, but rather junk that was Frankensteined together from different values. Here's what we have after the second pass:

0x40000000: 0xDE
0x40000001: 0xAD
0x40000002: 0xBE
0x40000003: 0xEF

0x40000004: 0x12
0x40000005: 0x34
0x40000006: 0x56
0x40000007: 0x78

0x40000008: 0x00
0x40000009: 0x00
0x4000000A: 0x00
0x4000000B: 0x00

0x4000000C: 0x01

Here, R4 = 0xADBEEF12 and R5 = 0x34567800. In order to correctly iterate through memory, we need to change ADDI R3,R3, 1 to ADDI R3,R3, 4.
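The walkthrough above can be replayed with a small Python simulation. This is a sketch under the same assumptions as the example (32-bit words, big-endian byte order, made-up memory contents starting at 0x40000000), plus one more assumption of mine: that SLT compares its operands as signed integers. The names `run`, `to_signed32`, and `BASE` are hypothetical.

```python
BASE = 0x40000000
# The two made-up words from the example, followed by 8 zero bytes.
initial = bytes.fromhex("DEADBEEF" "12345678") + bytes(8)

def to_signed32(x):
    """Reinterpret an unsigned 32-bit value as signed (assumed SLT semantics)."""
    return x - (1 << 32) if x & 0x80000000 else x

def run(stride, passes=2):
    """Replay instructions 2-6 of the loop, advancing R3 by `stride` each pass."""
    mem = bytearray(initial)
    r3 = BASE
    loads = []
    for _ in range(passes):
        off = r3 - BASE
        r4 = int.from_bytes(mem[off:off + 4], "big")       # LD R4, 0(R3)
        r5 = int.from_bytes(mem[off + 4:off + 8], "big")   # LD R5, 4(R3)
        r6 = 1 if to_signed32(r4) < to_signed32(r5) else 0  # SLT R6, R4, R5
        mem[off + 8:off + 12] = r6.to_bytes(4, "big")      # SD R6, 8(R3)
        loads.append((r4, r5))
        r3 += stride                                        # ADDI R3, R3, stride
    return loads, mem

loads, mem = run(stride=1)
print([hex(v) for v in loads[1]])  # second-pass values of R4 and R5
print(mem[8:13].hex())             # bytes at 0x40000008-0x4000000C
```

Calling `run(stride=4)` instead keeps every access on a word boundary, so the second pass loads whole words: 0x12345678 into R4 and the result stored by the first pass into R5.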

Now why would the CPU even let you do this? Well, some CPUs actually don't, and will fault if you try to access an unaligned address. Others, like x86, aren't so picky. As it turns out, the CPU has no idea what type your data is, and relies on the programmer or compiler to enforce type rules.
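As a rough illustration of what a strict CPU checks, the Python sketch below tests whether an address is a multiple of the access size. The helpers `is_aligned` and `strict_access` are hypothetical, and raising `ValueError` merely stands in for a hardware alignment fault.

```python
def is_aligned(address, size=4):
    """Return True if `address` is a multiple of `size` bytes."""
    return address % size == 0

def strict_access(address, size=4):
    """Model of a strict CPU: reject a misaligned access instead of performing it."""
    if not is_aligned(address, size):
        raise ValueError(f"misaligned {size}-byte access at {address:#x}")
    return address

print(is_aligned(0x40000000))  # R3 on the first pass
print(is_aligned(0x40000001))  # R3 after ADDI R3, R3, 1
```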

Upvotes: 1
