CharlesLiuChina
CharlesLiuChina

Reputation: 271

program crash in std::sort() sometimes, can't reproduce

Description:

My program crash sometimes in std::sort(), I write a minimal program to reproduce this situation, but everything is just alright. Here is the minimal example:

typedef struct st {
    int it;
    char ch;
    char charr[100];
    vector<string *> *vs;
} st;

bool function(st *&s1, st *&s2) {
    static int i = 1;
    cout<<i<<" "<<&s1<<" "<<&s2<<endl;
    ++i;
    return s1->it > s2->it;
}

int main(int argc, char **argv) {
    vector<st *> ar;
    for (int i = 0; i < 100; ++i) {
        st *s = new st;
        s->it = urandom32();
        ar.push_back(s);
    }

    ar.clear();

    for (int i = 0; i < 100; ++i) {
        st *s = new st;
        s->it = urandom32();
        ar.push_back(s);
    }

    sort(ar.begin(), ar.end(), function);

    return 0;
}

Here is the GDB stack info:

  • 0 0x00007f24244d9602 in article_cmp (cand_article_1=0x7f23fd297010, cand_article_2=0x4015) at src/recom_frame_worker.h:47
  • 1 0x00007f24244fc41b in std::__unguarded_partition<__gnu_cxx::__normal_iterator > >, cand_article*, bool ()(cand_article, cand_article*)> (__first=, __last=, __pivot=@0x7f230412b350: 0x7f23fd297010, __comp=0x7f24244d95e1 ) at /usr/include/c++/4.8.3/bits/stl_algo.h:2266
  • 2 0x00007f24244f829c in std::__unguarded_partition_pivot<__gnu_cxx::__normal_iterator > >, bool ()(cand_article, cand_article*)> (__first=, __last=, __comp=0x7f24244d95e1 ) at /usr/include/c++/4.8.3/bits/stl_algo.h:2296
  • 3 0x00007f24244f1d88 in std::__introsort_loop<__gnu_cxx::__normal_iterator > >, long, bool ()(cand_article, cand_article*)> (__first=, __last=, __depth_limit=18, __comp=0x7f24244d95e1 ) at /usr/include/c++/4.8.3/bits/stl_algo.h:2337
  • 4 0x00007f24244ed6e5 in std::sort<__gnu_cxx::__normal_iterator > >, bool ()(cand_article, cand_article*)> ( __first=, __last=, __comp=0x7f24244d95e1 ) at /usr/include/c++/4.8.3/bits/stl_algo.h:5489

article_cmp is called in sort(article_result->begin(), article_result->end(), article_cmp); and article_result is a vector<cand_article*> *. cand_article is a struct.
Here is the definition of article_cmp:

bool article_cmp(cand_article* cand_article_1, cand_article* cand_article_2) {
    return cand_article_1 -> display_time >= cand_article_2 -> display_time;
}

Here is a piece of code where the crash happens:

 article_result->clear();
 for(vec_iter = _channel_data -> begin(); vec_iter != _channel_data -> end(); vec_iter++) {
     cand_article* cand = to_cand_group(*vec_iter);
     if(cand == NULL) continue;
     // refresh open loadmore
     if(m_request.req_type == 1) {
         if(cand -> display_time > m_request.start){
             article_result->push_back(cand);
         }
     }else if(m_request.req_type == 2){
         if(cand -> display_time < m_request.end){
             article_result->push_back(cand);
         }
     }else{
         article_result->push_back(cand);
     }
 }

 sort(article_result->begin(), article_result->end(), article_cmp);

Question:

I don't know how to handle this kind of coredump, cause 0x4015 is a kernel space address? Any suggestions on how to fix this kind of bug? sorry, I can't reproduce this situation with a minimal program. And this happened in a single thread, so you don't need to think about multi-thread situation.

Upvotes: 2

Views: 1253

Answers (2)

Nemo Vu
Nemo Vu

Reputation: 26

  1. Your minimal program is making memory leaks. Because it just removes all the items from the list but did not release the memory used by them. In the case your items are big enough, your program might get crashed after eating up all the memory. That's why your minimal program is still okay, because the items there are very small.

  2. I would change your program to:

    typedef struct st { int it; char ch; char charr[100]; vector *vs; } st;

    bool function(st *&s1, st *&s2) { static int i = 1; cout<it > s2->it; }

    int main(int argc, char **argv) { vector ar; for (int i = 0; i < 100; ++i) { st *s = new st; s->it = urandom32(); ar.push_back(s); }

    release all the memory used my ar's items first

    for (vector::iterator it = ar.begin(); it != ar.end(); ++it) delete *it;

    ar.clear();

    for (int i = 0; i < 100; ++i) {
        st *s = new st;
        s->it = urandom32();
        ar.push_back(s);
    }
    
    sort(ar.begin(), ar.end(), function);
    
    return 0;
    

    }

Upvotes: 0

The rule is "if std::sort crashes, you have an invalid comparison function". Your comparison function is:

bool article_cmp(cand_article* lhs, cand_article* rhs) {
    return lhs -> display_time >= rhs -> display_time;
}

This is not a strict weak ordering. In particular, if the display times are equal it returns true, which means that if you swap the arguments it will still return true ... and that is not allowed. You need:

bool article_cmp(cand_article* lhs, cand_article* rhs) {
    return lhs -> display_time > rhs -> display_time;
}

The reason your simplified example works (congratulations for at least trying to simplify), is that you simplified the comparison function so it is valid. If the return statement was return s1->it >= s2->it;, and you used a smaller range of values, it too would probably crash.

Incidentally, a much more natural C++ declaration of your example structure would look like:

struct st {                       // No need for that typedef in C++
    int it;
    char ch;
    std::string charr;            // ... or *possibly* std::array<char,100>.
    std::vector<std::string> vs;  // Strings and vectors best held by value
};

Also note that I have actually used the std:: prefix.

Upvotes: 7

Related Questions