understanding MIPS csum_partial

Question

I'm trying to understand the csum_partial() function in arch/mips/lib/csum_partial.S in the vanilla 2.6.35 kernel. It looks like there is a bug when the input length is less than 8 bytes. I know that sounds unlikely, which is why I'm asking here. The function starts as follows:

/*
 * a0: source address
 * a1: length of the area to checksum
 * a2: partial checksum
 */

#define src a0

#define sum v0

    .text
    .set    noreorder
    .align  5
LEAF(csum_partial)
    move    sum, zero
    move    t7, zero

    sltiu   t8, a1, 0x8
    bnez    t8, .Lsmall_csumcpy     /* < 8 bytes to copy */
    move    t2, a1

When the input length is less than 8, we jump to .Lsmall_csumcpy and never reach the move instruction, right? And there we have:

.Lsmall_csumcpy:

   move a1, t2
   ...

My question is: where is the t2 register initialized? Thanks a lot in advance!

markgz · Accepted Answer

The move in the following code is in a branch delay slot:

bnez    t8, .Lsmall_csumcpy     /* < 8 bytes to copy */
 move   t2, a1

On MIPS, the instruction in the delay slot executes whether or not the branch is taken, so t2 has already been assigned by the time execution reaches .Lsmall_csumcpy. The move instruction is indented in the code to show the reader that it is in a delay slot.

By default the assembler manages delay slots itself, typically filling them with NOPs, but because of the .set noreorder directive it emits the instructions in this code in exactly the order in which they are written.

Some MIPS simulators used in classrooms do not enable branch delay slots by default, so this code might not work correctly in such a simulator unless delay slots are enabled.
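
To make the difference concrete, here is a toy Python model of the three instructions in question (sltiu, bnez, move). It is only a sketch, not a real MIPS simulator: the function name run_prologue, the delay_slots flag and the "small"/"main" return labels are made up for illustration and do not come from the kernel source.

```python
# Toy model of a MIPS branch delay slot; it is not a real MIPS simulator.
# Only the sltiu / bnez / move sequence from csum_partial is modelled.
# run_prologue, delay_slots and the "small"/"main" labels are hypothetical.

def run_prologue(length, delay_slots=True):
    """Return (path, t2) after the first instructions of csum_partial.

    path is "small" if the branch to .Lsmall_csumcpy is taken, else "main".
    t2 is None if the move instruction never executed.
    """
    regs = {"a1": length, "t2": None, "t8": 0}

    # sltiu t8, a1, 0x8: t8 = 1 if a1 < 8 (unsigned compare), else 0
    regs["t8"] = 1 if (regs["a1"] & 0xFFFFFFFF) < 8 else 0

    # bnez t8, .Lsmall_csumcpy: the branch decision is made here
    branch_taken = regs["t8"] != 0

    # move t2, a1: the instruction that follows the branch.
    # With delay slots it always executes; without them it only
    # executes when the branch falls through.
    if delay_slots or not branch_taken:
        regs["t2"] = regs["a1"]

    return ("small" if branch_taken else "main", regs["t2"])


if __name__ == "__main__":
    print(run_prologue(5, delay_slots=True))   # ('small', 5)
    print(run_prologue(5, delay_slots=False))  # ('small', None)
```

With delay slots modelled, t2 holds the length on both paths. With them turned off, the short-input path reaches .Lsmall_csumcpy with t2 never written, which is the apparent bug described in the question.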
