Reputation: 431
I tried a few Google searches before making this post, but to be honest I don't know what to search for. I have a C++ project and have been happily going about using the GNU compilers (g++). Today I tried to compile with clang++ and got a segfault.
Fine, ok, I can deal with this. After perusing my code and printing some stuff I was able to fix the problem. However, the solution deeply troubles and confuses me.
Here's the situation: I'm using a tree-like data structure that stores a class called Ligament, but I'm storing it in a std::vector. I do this by storing a vector of "children" which are really just integer offsets between parent and child within the vector. In this way I can access children by using the this pointer, i.e.
child = this[offset];
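For anyone unfamiliar with this layout, here is a minimal sketch of the offset scheme. Node, child and makeSmallTree are hypothetical names used only for illustration; they are not the real Ligament class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for Ligament: a node that knows its children
// only as element offsets relative to its own position in the vector.
struct Node {
    int value = 0;
    std::vector<std::uint32_t> children;  // offset (in elements) from this node to each child

    // Only valid while *this is an element of a contiguous array,
    // such as the storage of a std::vector<Node>.
    Node& child(std::size_t i) {
        assert(i < children.size());
        return this[children[i]];
    }
};

// Builds a three-node tree: root at index 0, children at indices 1 and 2.
std::vector<Node> makeSmallTree() {
    std::vector<Node> nodes(3);
    nodes[0].value = 10;
    nodes[1].value = 20;
    nodes[2].value = 30;
    nodes[0].children = {1, 2};
    return nodes;
}
```

Because the offsets are relative, they survive the vector moving its storage; raw pointers or references into the vector do not.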
However, none of that's important. Here's the issue: I have a Ligament::addChild(uint32_t) function that takes an integer and pushes it to the back of a vector that is a member of Ligament:
void Ligament::addChild(uint32_t offset){
    children.push_back(offset);
}
Very simple stuff. In general I pass to addChild an argument that gets returned from a recursive function called fill:
//starting at root
uint32_t fill(vector<Ligament>& lVec, TiXmlElement * el){
    //store current size here, as size changes during recursion
    uint32_t curIdx = lVec.size();
    lVec.push_back(createLigament());
    //Add all of this Ligament's children
    TiXmlElement * i = el->FirstChildElement("drawable");
    for (; i; i=i->NextSiblingElement("drawable")){
        uint32_t tmp = fill(lVec, i) - curIdx;
        lVec[curIdx].addChild(tmp);
        //Does not work in clang++, but does in g++
        //lVec[curIdx].addChild(fill(lVec,i)-curIdx);
    }
    //return the ligament's index
    return curIdx;
}
The fill function gets called on an XML element and goes through its children, depth first.
Sorry if all that was unclear, but the core of the problem seems to be what's in that for loop. For some reason I have to store the return value of the fill call in a variable before I send it to the addChild function.
If I don't store it in a temporary variable, it seems as though the addChild function does not change the size of children, but I can't imagine why.
To check all this I printed out the size of the children vector before and after these calls, and it never went above 1. Only when I called addChild with a value that wasn't directly returned from a function did it seem to work.
I also printed out the values of offset inside the addChild function as well as inside the for loop before it was called. In all cases the values were the same, both in clang++ and in g++.
Since the issue is resolved I was able to move forward, but this is something I'd expect to work. Is there something I'm doing wrong?
Feel free to yell at me if I could do more to make this question clearer.
ALSO: I realize now that passing lVec around by reference through these recursions may be bad, as a push_back call may cause the address to change. Is this a legitimate concern?
EDIT:
So as people have pointed out, my final concern turned out to be related to the issue. The fill call has the potential to resize the vector, while the lVec[curIdx] access refers to an element in the vector that addChild then modifies. The order in which these things occur can have drastic consequences.
As a follow-up, is using the tmp variable acceptable? There's still the issue of a reallocation occurring... I think I will use SHR's suggestion of a map, then convert it to a vector when all is said and done.
Upvotes: 3
Views: 92
Reputation: 42544
// Does not work in clang++, but does in g++:
lVec[curIdx].addChild(fill(lVec,i)-curIdx);
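The bug you are seeing is due to dependence on order of evaluation. Since fill(lVec, i) may cause lVec to reallocate its elements, the program will have undefined behavior if lVec[curIdx] is evaluated before fill(lVec,i): the reference it yields is left dangling by the reallocation. The order of evaluation of function arguments - and the postfix expression that determines which function to call - is unspecified. (That is the rule before C++17. From C++17 on, the postfix expression is sequenced before the arguments, so in the one-liner lVec[curIdx] is always evaluated first.)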
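A minimal sketch of the safe ordering, with simplified stand-ins: Spec plays the role of the XML element and Node the role of Ligament. Both are hypothetical, as is the sanity-check assert. The recursive call completes in its own full statement, and the vector is indexed only afterwards.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for TiXmlElement and Ligament.
struct Spec {
    std::vector<Spec> kids;
};

struct Node {
    std::vector<std::uint32_t> children;
    void addChild(std::uint32_t offset) { children.push_back(offset); }
};

std::uint32_t fill(std::vector<Node>& nodes, const Spec& spec) {
    // Remember this node's index; indices stay valid across reallocation.
    const std::uint32_t curIdx = static_cast<std::uint32_t>(nodes.size());
    nodes.push_back(Node{});
    for (const Spec& kid : spec.kids) {
        // Full statement: every push_back made by the recursion has finished
        // by the time this line ends.
        const std::uint32_t offset = fill(nodes, kid) - curIdx;
        // The element is looked up only now, in the vector's current storage.
        assert(curIdx < nodes.size());
        nodes[curIdx].addChild(offset);
    }
    return curIdx;
}
```

This is why the tmp version from the question is fine even when a reallocation happens: no reference into the vector is held across the recursive call, only the index curIdx.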
Upvotes: 5
Reputation: 8313
I think it is undefined behavior.
You push into the vector and index into it in the same statement.
One compiler may do the fill first and the other may get lVec[curIdx] first.
If that is the case, it will work for both compilers when you use map<uint32_t,uint32_t> instead of the vector, since a map doesn't require its memory to be contiguous, so inserting new elements does not move the existing ones.
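A rough sketch of how the map idea could look. It assumes the map is keyed by node index and holds the nodes themselves, which goes beyond the map<uint32_t,uint32_t> written above. Spec, Node, NodeMap, fillMap and toVector are all hypothetical names. It relies on the fact that inserting into a std::map does not invalidate references to existing elements.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-ins for TiXmlElement and Ligament.
struct Spec {
    std::vector<Spec> kids;
};

struct Node {
    std::vector<std::uint32_t> children;
    void addChild(std::uint32_t offset) { children.push_back(offset); }
};

using NodeMap = std::map<std::uint32_t, Node>;

std::uint32_t fillMap(NodeMap& nodes, const Spec& spec) {
    const std::uint32_t curIdx = static_cast<std::uint32_t>(nodes.size());
    // operator[] inserts a default Node; the reference stays valid while
    // other keys are inserted by the recursive calls.
    Node& cur = nodes[curIdx];
    for (const Spec& kid : spec.kids) {
        cur.addChild(fillMap(nodes, kid) - curIdx);
    }
    return curIdx;
}

// Keys are 0..n-1 and std::map iterates in ascending key order,
// so copying in iteration order reproduces the vector layout.
std::vector<Node> toVector(const NodeMap& nodes) {
    std::vector<Node> out;
    out.reserve(nodes.size());
    for (const auto& entry : nodes) {
        assert(entry.first == out.size());
        out.push_back(entry.second);
    }
    return out;
}
```

The trade-off is an extra copy at the end and a node allocation per element. Building directly into the vector with the temporary variable avoids both.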
Upvotes: 2