Reputation: 99
I'm trying to understand the csum_partial() function in arch/mips/lib/csum_partial.S in the vanilla 2.6.35 kernel. It looks like there is a bug when the input length is less than 8 bytes. I know that doesn't sound plausible, which is why I'm asking here. The function starts as follows:
/*
* a0: source address
* a1: length of the area to checksum
* a2: partial checksum
*/
#define src a0
#define sum v0
.text
.set noreorder
.align 5
LEAF(csum_partial)
move sum, zero
move t7, zero
sltiu t8, a1, 0x8
bnez t8, .Lsmall_csumcpy /* < 8 bytes to copy */
move t2, a1
When the input length is less than 8, we jump to .Lsmall_csumcpy and never reach the move instruction, right? And there we have:
.Lsmall_csumcpy:
move a1, t2
...
My question is: where is the t2 register initialized? Thanks a lot in advance!
Upvotes: 0
Views: 361
Reputation: 6266
The move in the following code is in a branch delay slot:
bnez t8, .Lsmall_csumcpy /* < 8 bytes to copy */
move t2, a1
The instruction in the delay slot executes whether or not the branch is taken, and before control reaches the branch target, so t2 is assigned and has the correct value at .Lsmall_csumcpy.
The move instruction is indented in the code to show the reader that it is in a delay slot.
The assembler normally fills delay slots itself, typically with NOPs, but because of the .set noreorder directive it assembles the instructions in this code in exactly the order in which they are written.
Some MIPS simulators used in classrooms do not enable branch delay slots by default, so this code might not work correctly in such a simulator unless delay slots are enabled.
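To make the difference concrete, here is a minimal sketch in Python of a toy interpreter that runs the two instructions from the question with and without delay slots. It is not a real MIPS simulator: the run() and fresh_regs() functions, the tuple encoding of instructions, the label name "small", and the 0xDEAD placeholder for an uninitialized t2 are all assumptions made up for illustration.

```python
GARBAGE = 0xDEAD  # hypothetical stand-in for whatever t2 held before the call

# The branch, its delay slot, and the first instruction at the branch target.
PROGRAM = [
    ("bnez", "t8", "small"),   # bnez t8, .Lsmall_csumcpy
    ("move", "t2", "a1"),      #  move t2, a1   (delay slot)
    ("label", "small"),        # .Lsmall_csumcpy:
    ("move", "a1", "t2"),      # move a1, t2
]


def fresh_regs(length):
    """Register state after 'sltiu t8, a1, 0x8' for a given length in a1."""
    return {"a1": length, "t8": int(length < 8), "t2": GARBAGE}


def run(program, regs, delay_slots=True):
    """Execute the toy program; only 'move', 'bnez' and 'label' are modeled."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "move":
            dst, src = args
            regs[dst] = regs[src]
            pc += 1
        elif op == "bnez":
            reg, target = args
            taken = regs[reg] != 0
            if delay_slots:
                # The instruction after the branch runs whether or not the
                # branch is taken, before control moves on.
                _, dst, src = program[pc + 1]
                regs[dst] = regs[src]
                pc = labels[target] if taken else pc + 2
            else:
                # No delay slot: a taken branch skips the next instruction.
                pc = labels[target] if taken else pc + 1
        else:  # label: nothing to execute
            pc += 1
    return regs


with_slots = run(PROGRAM, fresh_regs(5), delay_slots=True)
without_slots = run(PROGRAM, fresh_regs(5), delay_slots=False)
print(with_slots["a1"], hex(without_slots["a1"]))
```

With delay slots modeled, t2 receives the length before .Lsmall_csumcpy reads it. With delay slots disabled, the taken branch skips the move and a1 is overwritten with the placeholder garbage, which is the apparent bug described in the question.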
Upvotes: 1