Reputation: 16709
I just handed in this function in an assignment. It is done (hence no homework tag). But I would like to see how this can be improved.
Essentially, the function sums the squares of all the integers between 1 and the given number, using the following formula:
n(n+1)(2n+1)/6
Where n
is the maximum number.
The function below is made to catch any overflow and return 0 should any occur.
UInt32 sumSquares(const UInt32 number)
{
int result = 0;
__asm
{
mov eax, number //move number in eax
mov edx, 2 //move 2 in edx
mul edx //multiply (2n)
jo end //jump to end if overflow
add eax, 1 //addition (2n+1)
jo end //jump to end if overflow
mov ecx, eax //move (2n+1) in ecx
mov ebx, number //move number in ebx
add ebx, 1 //addition (n+1)
jo end //jump to end if overflow
mov eax, number //move number in eax for multiplication
mul ebx //multiply n(n+1)
jo end //jump to end if overflow
mul ecx //multiply n(n+1)(2n+1)
jo end //jump to end if overflow
mov ebx, 6 //move 6 in ebx
div ebx //divide by 6, the result will be in eax
mov result, eax //move eax in result
end:
}
return result;
}
Basically, I want to know what I can improve in there. In terms of best-practices mostly. One thing sounds obvious: smarter overflow check (with a single check for whatever maximum input would cause an overflow).
Upvotes: 5
Views: 796
Reputation: 4966
UInt32 sumSquares(const UInt32 number)
{
__asm
{
mov eax, number // n
cmd eax, MAX_VALUE
jg bad_value
lea ebx, [eax+1] // n + 1
lea ecx, [2*eax+1] // 2n + 1
mul ebx
mul ecx
shr eax, 1 // divide by 2
mov ebx, 2863311531 // inverse of 3
mul ebx // divide by 3
ret
bad_value:
xor eax, eax // return 0
ret
}
}
http://blogs.msdn.com/devdev/archive/2005/12/12/502980.aspx
Spara
Upvotes: 3
Reputation: 4996
mov eax, number ; = n
cmp eax, 0x928 ; cannot handle n >= 0x928
jnc end
shl eax, 1 ; = n(2)
add eax, 3 ; = n(2)+3
mov ebx, number
mul ebx ; = n(n(2)+3)
add eax, 1 ; = n(n(2)+3)+1
mov ebx, number
mul ebx ; = n(n(n(2)+3)+1) = n(n+1)(2n+1)
mov ebx, 6
div ebx
mov result, eax
Rather than checking for overflow, this solution checks the input against the known maximum value that the function can handle. Note that the last multiplication is allowed to overflow, and it will overflow for any input number
greater than 0x509
. Checking against a known value rather than relying on overflow checks allows the function to handle almost twice as many input values. In fact, the function is able to handle every input whose result fits within 32 bits.
Upvotes: 3
Reputation: 41220
mov eax, number //move number in eax
mov ecx, eax //dup in ecx
mul ecx //multiply (n*n)
jo end //jump to end if overflow
add eax, ecx //addition (n*n+n); can't overflow
add ecx, ecx //addition (2n); can't overflow
add ecx, 1 //addition (2n+1); can't overflow
mul ecx //multiply (n*n+n)(2n+1)
jo end //jump to end if overflow
mov ecx, 6 //move 6 in ebx
div ecx //divide by 6, the result will be in eax
mov result, eax //move eax in result
Strength reduction: add instead of multiply.
By analysis, fewer overflow checks (you can do better as you described).
Keep values in registers instead of going back to the argument on the stack.
Chose registers carefully so values that can be reused are not overwritten.
Upvotes: 8