Reputation: 6478
A little while ago, I created a bunch of C macros for manipulating fixed-point values. Encouraged by several questions and answers on SO, I was hoping to get performance gains in a calculation-intensive part of my program. While the code seems to produce correct results, I'm wondering whether it's too naive or oversimplified, because it actually runs slower for me than the regular floating-point versions of my routines (I'm doing bicubic image interpolation on Wintel). Could you please look at this short piece of code containing my macros and suggest some improvements, particularly with regard to performance? Thanks.
// these are architecture-dependent
typedef short int fixed16;
typedef int fixed32;
typedef __int64 fixed64;
// value of 2^n
#define POW2(n) (1 << (n))
// create 16bit integer-based fixed point value from a floating point value, n is the number of bits reserved for the fractional part
#define FP_MAKE16(x, n) ((x) > 0.0 ? static_cast<fixed16>(floor((x) * POW2(n) + 0.5)) : static_cast<fixed16>(ceil((x) * POW2(n) - 0.5)))
// the same, 32bit
#define FP_MAKE32(x, n) ((x) > 0.0 ? static_cast<fixed32>(floor((x) * POW2(n) + 0.5)) : static_cast<fixed32>(ceil((x) * POW2(n) - 0.5)))
// and 64bit
#define FP_MAKE64(x, n) ((x) > 0.0 ? static_cast<fixed64>(floor((x) * POW2(n) + 0.5)) : static_cast<fixed64>(ceil((x) * POW2(n) - 0.5)))
// convert a fixed-point integer from one (n) format to another (m) assuming n < m
#define FP_CONVERT_UP(x, n, m) ((x) << (m-n))
// same for n > m
#define FP_CONVERT_DOWN(x, n, m) ((x) >> (n-m))
// convert floating-point value back to float
#define FP_FLOAT(x, n) (static_cast<float>(x) / POW2(n))
// same for double
#define FP_DOUBLE(x, n) (static_cast<double>(x) / POW2(n))
// and for int. fractional part will be discarded!
#define FP_INT(x, n) ((x) >> (n))
// arithmetic operations for same-format numbers
#define FP_NEG(a) ((~(a)) + 1)
#define FP_ADD(a, b) ((a) + (b))
#define FP_SUB(a, b) ((a) + FP_NEG(b))
#define FP_MUL(a, b, n) (((a) * (b)) >> (n))
#define FP_DIV(a, b, n) (((a) << (n)) / (b))
#define FP_POW2(a, n) (((a) * (a)) >> (n))
#define FP_POW3(a, n) (((((a) * (a)) >> (n)) * (a)) >> (n))
// arithmetic for different-format numbers, assuming n is the target (result) format and n > m
#define FP_ADD_UP(a, b, n, m) ((a) + ((b) << (n-m)))
#define FP_SUB_UP(a, b, n, m) ((a) + FP_NEG((b) << (n-m)))
#define FP_MUL_UP(a, b, n, m) (((a) * (b)) >> (m))
#define FP_DIV_UP(a, b, n, m) (((a) << (m)) / (b))
// same for n < m
#define FP_ADD_DOWN(a, b, n, m) ((a) + ((b) >> (m-n)))
#define FP_SUB_DOWN(a, b, n, m) ((a) + FP_NEG((b) >> (m-n)))
#define FP_MUL_DOWN(a, b, n, m) (((a) * (b)) >> (m))
#define FP_DIV_DOWN(a, b, n, m) (((a) << (m)) / (b))
EDIT: Basically, the answers and comments turned towards these two points:
While I am extremely grateful for the insight provided thus far, I was hoping to hear from somebody who has actually done fixed-point calculations, to tell me whether these arithmetic operations are indeed the way to go. Perhaps there is some additional spiffy bit-twiddling I am unaware of that makes a difference for performance or precision? In other words, if I am to encapsulate this code, can I keep the arithmetic in the inline operator functions basically the same as it is now, or should I change it somehow?
Upvotes: 1
Views: 2098
Reputation: 168626
This is a response to @neuviemeporte's comments about performance. I'm making this an answer instead of a comment so I can format the code more easily.
You said, "AFAIK they are actually implemented as functions which have call overhead", and "I'm guessing struct members also have to be referenced by some sort of pointer; you can't escape from 'this'". Both of those concerns are valid on their face, but let's investigate further.
I am using gcc on Linux/x86. Consider this program:
typedef int fixed32;
#define FP_ADD(a,b) ((a)+(b))
#define FP_NEG(a) ((~a)+1)
#define FP_SUB(a,b) ((a)+FP_NEG(b))
#define FP_INT(x,n) ((x)>>n)
#define FP_MUL(a,b,n) (((a)*(b))>>n)
#define FP_DIV(a,b,n) (((a)<<n)/(b))
template<class T, unsigned n>
struct Fixed {
private:
    T theBits;
public:
    Fixed(T t = T()) : theBits(t) {}
    Fixed operator+(const Fixed& rhs) const {
        return Fixed(theBits + rhs.theBits);
    }
    Fixed operator-(const Fixed& rhs) const {
        return Fixed(theBits - rhs.theBits);
    }
    Fixed operator~() const {
        return Fixed(~theBits);
    }
    Fixed operator*(const Fixed& rhs) const {
        return Fixed((theBits * rhs.theBits) >> n);
    }
    Fixed operator/(const Fixed& rhs) const {
        return Fixed((theBits << n) / rhs.theBits);
    }
    operator T() const {
        return theBits >> n;
    }
};
int DoFpAdd(const fixed32 a, const fixed32 b) {
    fixed32 result = FP_ADD(a, b);
    return FP_INT(result, 16);
}
int DoFixedAdd(const Fixed<int, 16> a, const Fixed<int, 16> b) {
    return a + b;
}
int DoFpSub(const fixed32 a, const fixed32 b) {
    fixed32 result = FP_SUB(a, b);
    return FP_INT(result, 16);
}
int DoFixedSub(const Fixed<int, 16> a, const Fixed<int, 16> b) {
    return a - b;
}
int DoFpMul(const fixed32 a, const fixed32 b) {
    fixed32 result = FP_MUL(a, b, 16);
    return FP_INT(result, 16);
}
int DoFixedMul(const Fixed<int, 16> a, const Fixed<int, 16> b) {
    return a * b;
}
int DoFpDiv(const fixed32 a, const fixed32 b) {
    fixed32 result = FP_DIV(a, b, 16);
    return FP_INT(result, 16);
}
int DoFixedDiv(const Fixed<int, 16> a, const Fixed<int, 16> b) {
    return a / b;
}
I compiled it with this command line:
g++ -O4 -Wall -pedantic -ansi -S x.cc && c++filt <x.s > x.S
It may surprise you to know that the similarly-named functions produced identical assembly language. Yes, FP_ADD() and Fixed<>::operator+ are the same: no function calls (it is all inlined), no this pointer, just instruction-for-instruction identical assembly language.
There is no difference in execution speed. There is a huge difference in usability, maintainability and readability. I recommend that you perform a similar experiment on whatever platform you are using, and switch to a class interface.
Upvotes: 4
Reputation: 92261
You should probably read this paper from the C++ committee - "Technical Report on C++ Performance".
http://www.open-std.org/jtc1/sc22/wg21/docs/TR18015.pdf
It effectively kills some of the myths.
Upvotes: 2
Reputation: 9687
You can find out by writing a unit test for your implementation. A simple version is a sequence of assertions:
assert(FP_ADD(7, 5) == 12);
assert(FP_SUB(7, 5) == 2);
etc.
Cover enough use cases until you are confident in your code. Also, don't forget to compare doubles and floats within a small epsilon; exact equality may not work as expected due to the limitations of their bit representations.
Upvotes: 2
Reputation: 146910
Inline functions. Use them. Not macros. Use a class and overload its operators. And static_cast does not exist in C. This code is so abysmally terrible that if you posted a code sample using it, it would be totally unreadable.
Remember that floating-point operations are implemented in hardware, while the fixed-point operations you've implemented are in software. There's going to be a penalty for that trade, and it could easily be that your code simply isn't fast enough at the algorithmic level to overcome it.
Upvotes: 5