Reputation: 16709

Help improving a simple assembly function

I just handed in this function in an assignment. It is done (hence no homework tag). But I would like to see how this can be improved.

Essentially, the function sums the squares of all the integers between 1 and the given number, using the following formula:

n(n+1)(2n+1)/6

Where n is the maximum number.

The function below is made to catch any overflow and return 0 should any occur.

UInt32 sumSquares(const UInt32 number)
{
    int result = 0;
    __asm
    {
        mov eax, number  //move number in eax
        mov edx, 2       //move 2 in edx
        mul edx          //multiply (2n)
        jo end           //jump to end if overflow
        add eax, 1       //addition (2n+1)
        jo end           //jump to end if overflow
        mov ecx, eax     //move (2n+1) in ecx

        mov ebx, number  //move number in ebx
        add ebx, 1       //addition (n+1)
        jo end           //jump to end if overflow

        mov eax, number //move number in eax for multiplication
        mul ebx         //multiply n(n+1)
        jo end          //jump to end if overflow
        mul ecx         //multiply n(n+1)(2n+1)
        jo end          //jump to end if overflow
        mov ebx, 6      //move 6 in ebx
        div ebx         //divide by 6, the result will be in eax

        mov result, eax //move eax in result

end:
    }

    return result;
}

Basically, I want to know what I can improve in there. In terms of best-practices mostly. One thing sounds obvious: smarter overflow check (with a single check for whatever maximum input would cause an overflow).

Upvotes: 5

Answers (3)

Sparafusile

Reputation: 4966

UInt32 sumSquares(const UInt32 number)
{
  __asm
  {
    mov eax, number     // n
    cmd eax, MAX_VALUE
    jg  bad_value

    lea ebx, [eax+1]    // n + 1
    lea ecx, [2*eax+1]  // 2n + 1

    mul ebx
    mul ecx

    shr eax, 1          // divide by 2
    mov ebx, 2863311531 // inverse of 3
    mul ebx             // divide by 3

    ret

    bad_value:
    xor eax, eax        // return 0
    ret
  }
}

http://blogs.msdn.com/devdev/archive/2005/12/12/502980.aspx

Spara

Upvotes: 3

Matthew T. Staebler

Reputation: 4996

mov eax, number    ; = n
cmp eax, 0x928     ; cannot handle n >= 0x928
jnc end 
shl eax, 1         ; = n(2)
add eax, 3         ; = n(2)+3
mov ebx, number
mul ebx            ; = n(n(2)+3)
add eax, 1         ; = n(n(2)+3)+1
mov ebx, number
mul ebx            ; = n(n(n(2)+3)+1) = n(n+1)(2n+1)
mov ebx, 6
div ebx
mov result, eax

Rather than checking for overflow, this solution checks the input against the known maximum value that the function can handle. Note that the last multiplication is allowed to overflow, and it will overflow for any input number greater than 0x509. Checking against a known value rather than relying on overflow checks allows the function to handle almost twice as many input values. In fact, the function is able to handle every input whose result fits within 32 bits.

Upvotes: 3

Doug Currie

Reputation: 41220

    mov eax, number  //move number in eax
    mov ecx, eax     //dup in ecx
    mul ecx          //multiply (n*n)
    jo end           //jump to end if overflow
    add eax, ecx     //addition (n*n+n); can't overflow
    add ecx, ecx     //addition (2n); can't overflow
    add ecx, 1       //addition (2n+1); can't overflow
    mul ecx          //multiply (n*n+n)(2n+1)
    jo end           //jump to end if overflow
    mov ecx, 6       //move 6 in ebx
    div ecx          //divide by 6, the result will be in eax

    mov result, eax //move eax in result

Strength reduction: add instead of multiply.

By analysis, fewer overflow checks (you can do better as you described).

Keep values in registers instead of going back to the argument on the stack.

Chose registers carefully so values that can be reused are not overwritten.

Upvotes: 8

Help improving a simple assembly function

Answers (3)

Related Questions