selbie

Reputation: 104589

Handling __sync_add_and_fetch not being defined

In my open source software project, I call the gcc atomic builtins __sync_add_and_fetch and __sync_sub_and_fetch to implement atomic increments and decrements on certain variables. I periodically get an email from someone who tries to compile my code and hits the following linker error:

refcountobject.cpp:(.text+0xb5): undefined reference to `__sync_sub_and_fetch_4'
refcountobject.cpp:(.text+0x115): undefined reference to `__sync_add_and_fetch_4'

After some digging, I narrowed the root cause down to their older version of gcc (4.1), which defaults to a target architecture of i386. Evidently, gcc has no intrinsic for atomic addition on the 80386, so it implicitly emits a call to an undefined __sync_add_and_fetch_4 function in its place. A great description of how this works is here.
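For context, here is a minimal sketch of the kind of reference-counting code that triggers this. The class name RefCounted and its members are hypothetical and not taken from the project. When the target supports it, gcc expands the builtins inline; on a plain i386 target it emits calls to __sync_add_and_fetch_4 and __sync_sub_and_fetch_4 instead, which is where the undefined references come from.

```cpp
#include <cassert>

// Hypothetical reference-counted object; names are illustrative only.
class RefCounted
{
public:
    RefCounted() : _refs(1) {}

    // Atomically increments the count and returns the new value.
    int AddRef()
    {
        return __sync_add_and_fetch(&_refs, 1);
    }

    // Atomically decrements the count and returns the new value.
    int Release()
    {
        int n = __sync_sub_and_fetch(&_refs, 1);
        assert(n >= 0); // releasing more often than acquiring is a caller bug
        return n;
    }

private:
    volatile int _refs; // 4 bytes wide, hence the "_4" suffix in the linker error
};
```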

The easy workaround, as discussed here, is to tell them to modify the Makefile to append -march=pentium as one of the compiler flags. And all is good.

So what's the long term fix so users don't have to manually fix the Makefile?

I am considering a few ideas:

I don't want to hardcode -march=pentium as a compiler flag in the Makefile; I'm guessing that would break on anything that isn't Intel based. But I could certainly add it if the Makefile had a rule to detect that the default target was i386. I'm thinking of a rule that runs a script which calls gcc -dumpmachine and parses out the first field of the target triplet. If that field is i386, it would add the compiler flag. I'm assuming no one will actually be building for 80386 machines.

The other alternative is to actually supply an implementation of __sync_add_and_fetch_4 for the linker to fall back on. It could even be compiled conditionally based on whether the __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* macros are defined. I prototyped an implementation with a global pthread_mutex. It's likely not the best for performance, but it works and resolves the issue nicely. A better idea might be to write the inline assembly myself and use "lock xadd" for the implementation when compiling for x86.
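A rough sketch of what the mutex-based fallback could look like, not the exact prototype. The helper locked_add_4 and the lock name g_atomicLock are hypothetical. It assumes the compiler defines __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 whenever it can expand the 4-byte builtins natively; a compiler that predates that macro would simply compile the fallback unconditionally.

```cpp
#include <pthread.h>

// One process-wide lock shared by every emulated atomic operation.
static pthread_mutex_t g_atomicLock = PTHREAD_MUTEX_INITIALIZER;

// Hypothetical helper: adds delta to *pVal under the lock and returns the new value.
unsigned int locked_add_4(volatile void* pVal, unsigned int delta)
{
    volatile unsigned int* p = static_cast<volatile unsigned int*>(pVal);

    pthread_mutex_lock(&g_atomicLock);
    unsigned int result = *p + delta;
    *p = result;
    pthread_mutex_unlock(&g_atomicLock);

    return result;
}

// Only supply the out-of-line symbols when the compiler cannot inline the builtins.
#if !defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
extern "C" unsigned int __sync_add_and_fetch_4(volatile void* pVal, unsigned int inc)
{
    return locked_add_4(pVal, inc);
}

extern "C" unsigned int __sync_sub_and_fetch_4(volatile void* pVal, unsigned int inc)
{
    // Adding (0 - inc) wraps modulo 2^32, which is the same as subtracting inc.
    return locked_add_4(pVal, 0u - inc);
}
#endif
```

The lock only serializes callers that go through these functions, so every access to the shared variable has to use them for the emulation to hold.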

Upvotes: 2

Views: 4809

Answers (2)

selbie

Reputation: 104589

This is my other working solution. It might have its place in certain situations, but I opted for the Makefile + script solution from my other answer.

This solution provides local definitions for __sync_add_and_fetch_4, __sync_fetch_and_add_4, __sync_sub_and_fetch_4, and __sync_fetch_and_sub_4 in a separate source file. They get linked in only if the compiler couldn't generate them natively. Some assembly required, but Wikipedia of all places had a reasonable implementation that I could reference. (I also disassembled what the compiler normally generates to check that everything else was correct.)

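#if defined(__i386) || defined(i386) || defined(__i386__)
// Atomically adds inc to *pVal and returns the value *pVal held before the add.
extern "C" unsigned int xadd_4(volatile void* pVal, unsigned int inc)
{
    volatile unsigned int* pValInt = (volatile unsigned int*)pVal;

    // xadd exchanges the register with memory and stores the sum in memory,
    // so the register operand (inc) ends up holding the old value.
    // Both operands are read and written, hence the "+" constraints.
    asm volatile(
        "lock; xaddl %0, %1"
        : "+r" (inc), "+m" (*pValInt)
        :
        : "memory", "cc" );

    return inc;
}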

extern "C" unsigned int __sync_add_and_fetch_4(volatile void* pVal, unsigned int inc)
{
    return (xadd_4(pVal, inc) + inc);
}

extern "C" unsigned int __sync_sub_and_fetch_4(volatile void* pVal, unsigned int inc)
{
    return (xadd_4(pVal, -inc) - inc);
}

extern "C" unsigned int __sync_fetch_and_add_4(volatile void* pVal, unsigned int inc)
{
    return xadd_4(pVal, inc);
}

extern "C" unsigned int __sync_fetch_and_sub_4(volatile void* pVal, unsigned int inc)
{
    return xadd_4(pVal, -inc);
}

#endif
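As a sanity check on the arithmetic in the wrappers, here is a small sketch that uses the compiler's own __sync_fetch_and_add as a stand-in for xadd_4, since both return the value held before the addition. The helper names fetch_add, add_and_fetch, and sub_and_fetch are hypothetical. It shows why adding the negated value and then subtracting inc again yields the post-decrement result, relying on unsigned wraparound.

```cpp
// Stand-in for xadd_4: atomically adds inc and returns the previous value.
unsigned int fetch_add(volatile unsigned int* p, unsigned int inc)
{
    return __sync_fetch_and_add(p, inc);
}

// New value = old value + inc.
unsigned int add_and_fetch(volatile unsigned int* p, unsigned int inc)
{
    return fetch_add(p, inc) + inc;
}

// Adding (0 - inc) wraps modulo 2^32, which is the same as subtracting inc;
// subtracting inc from the returned old value then gives the new value.
unsigned int sub_and_fetch(volatile unsigned int* p, unsigned int inc)
{
    return fetch_add(p, 0u - inc) - inc;
}
```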

Upvotes: 1

selbie

Reputation: 104589

With no replies, I struck out on my own to solve it.

There are two possible solutions; this is one of them.

First, add the following script, getfixupflags.sh, to the same directory as the Makefile. The script detects whether the compiler is likely targeting i386 and, if so, prints "-march=pentium".

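#!/bin/bash
# Prints extra compiler flags needed when the compiler targets plain i386.
# Usage: getfixupflags.sh <compiler>

_cxx=$1
_fixupflags=
_regex_i386='^i386'

if [[ -z $_cxx ]]; then echo "_cxx var is empty - exiting" >&2; exit 1; fi

# -dumpmachine prints the target triplet; only its leading field is checked
_target=$($_cxx -dumpmachine)
if [[ $_target =~ $_regex_i386 ]]; then
    _fixupflags="$_fixupflags -march=pentium"
fi

if [[ -n $_fixupflags ]]; then echo $_fixupflags; fi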

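Now fix the Makefile to use this script. Make sure the script is executable (chmod +x getfixupflags.sh), then add the following line to the Makefile. The ./ prefix is needed unless the current directory is on the PATH.

FIXUP_FLAGS := $(shell ./getfixupflags.sh $(CXX))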

Then modify the compiler directives in the Makefile to include the FIXUP_FLAGS when compiling code. For example:

%.o: %.cpp
    $(COMPILE.cpp) $(FIXUP_FLAGS) $^

Upvotes: 0
