Minighost
Minighost

Reputation: 63

Segmentation fault when using vector.push_back inside of certain loops

I'm currently using g++ on a Cygwin terminal as per my professor's request.

I'm supposed to take in an input file and read it word by word, then place all words inside a vector, sorted alphabetically and with no duplicates.

However, every time I try to manipulate my vector (i.e. - push_back) inside of certain loops, my program just segmentation faults.

Here's a snippet of my code:

void word_count(ifstream& input){
    string temp;
    vector<string> v;

    input >> temp; //set first variable
    v.push_back(temp);

    while (!input.eof()) { //I'm aware of the limitations while using !eof, this is just the way I am required to loop over a file
        input >> temp;

        for (vector<string>::iterator i = v.begin(); i != v.end(); i++) { //check entire vector for word
            if (*i == temp) { //just break and skip the word if it already exists
                break;
            }
            if (i == v.end() - 1) { //if the word doesn't exist yet
                for (vector<string>::iterator k = v.begin(); k != v.end(); k++) { //re-search the vector for the proper place
                    if (k == v.end() - 1) { //if at the end, just push_back the vector
                        v.push_back(temp); //Causes segmentation fault
                        break;
                    }
                    if ((*k < temp) && (*(k + 1) > temp)) { //find correct place and insert the word in the vector
                        v.insert(k, temp); //Also causes segmentation fault if execution even manages to get this far
                    }
                }
            }
        }
    }
}

The first push_back on line 5 is perfectly fine, I can copy and paste that multiple times without error. I can also push_back right after input >> temp (inside of the while loop) without error. However, if I try push_back under the 'k' loop, it segmentation faults. I'm totally stumped.

I've tried looking at the other vector related questions here on StackOverflow, but I don't really understand why I can (or cannot) use push_back in certain places.

Thanks for any help in advance!

Edit 1: I should mention that I tested it in VS 2019. The vector library file popped up, stating a "read-access-violation" exception was thrown. No segmentation faults (Or maybe that's VS's way of telling me a segmentation fault occurred?)

Edit 2: Modifying a vector invalidates iterators. I didn't know that, thanks for the help everyone!

Edit 3: I'm only allowed to use vectors, not sets or other containers. If I could use a set, I totally would.

Upvotes: 0

Views: 2226

Answers (3)

molbdnilo
molbdnilo

Reputation: 66371

Modifying a vector as you're iterating over it may invalidate iterators, and then anything can happen.

But you're overcompicating things - since the vector is ordered, you don't need to first see if the string exists and then search for the correct position, you can look for the position directly.
(That you don't need to search twice is one of the discoveries you're supposed to make during this exercise.)

I would (since you're probably not supposed to use any functions from <algorithm> or such "advanced" features)

  • Break the loop when you reach the end or when you have found the item,
  • If you find an element greater than the item, you should insert before that position and stop.
    Luckily, insert takes an iterator to insert before, so you can use i.

Something like this:

for (vector<string>::iterator i = v.begin(); i != v.end() && *i != temp; ++i)
{
    if (*i > temp)
    {
        v.insert(i, temp);
        break;
    }
}

Note that the break means that i is not used for any comparisons after the insert, so the insertion is safe.

Upvotes: 1

MadScientist
MadScientist

Reputation: 3460

As mentioned, you could use a std::set to store your unique words. You could populate it like this:

std::set<std::string> set_of_words(std::ifstream & input)
{
  std::set<std::string> words;

  std::string word;
  while (input >> word)
  {
    words.insert(word);
  }

  return words;
}

or you can use a std::vector as in your question. Using std::lower_bound from <algorithm> you could use it like this:

std::vector<std::string> vector_of_words(std::ifstream & input)
{
  std::vector<std::string> words;

  std::string word;
  while (input >> word)
  {
    auto pos = std::lower_bound(words.begin(), words.end(), word);
    if (pos == words.end())
    {
      words.push_back(word);
    }
    else
    {
      if (*pos != word)
      {
        words.insert(pos, word);
      }
    }
  }

  return words;
}

Upvotes: 0

Marek R
Marek R

Reputation: 37697

When you modify vector iterators become invalid.

There are two reason:

  • when you push_back and std::vector::capacity is busted new block data is allocated and data are moved/copied to new buffer
  • when you add/remove items in middle old iterator can point to different item possibly not existing anymore.

There is quick way to fix it. When you do a modification you have to fetch updated value of iterator. poush_back doesn't have such functionality, but std::vector::insert returns iterator to new value and this iterator can be use to update for loop iterator.

I could fix you code, but it is so convoluted (to much indentation) that I wish to avoid that. You should first slice this code to smaller functions.

Instead salvage your code, here is my version:

template<typename Iter>
size_t count_unique_items(Iter begin, Iter end)
{
    using value_type = typename std::iterator_traits<Iter >::value_type;
    std::unordered_set<value_type> unique_items;

    std::copy(begin, end, std::inserter(unique_items, unique_items.end()));

    return unique_itmes.size();
}

size_t count_unique_words(std::istream& input)
{
    return count_unique_items(std::istream_iterator<std::string>{input}, {});
}

https://wandbox.org/permlink/bHji7JZoB7E9ZoLn

Upvotes: 2

Related Questions