Nico Bellic

Reputation: 363

Splitting std::string and inserting into a std::set

At the request of the fantastic fellas over at the C++ chat lounge: what is a good way to break down a file (which in my case contains a string of roughly 100 lines, with about 10 words on each line) and insert all of those words into a std::set?

Upvotes: 5

Views: 14969

Answers (3)

Josh

Reputation: 444

Assuming you've read your file into a string, boost::split will do the trick:

#include <iostream>
#include <set>
#include <string>
#include <boost/foreach.hpp>
#include <boost/algorithm/string.hpp>

int main()
{
    std::string astring = "abc 123 abc 123\ndef 456 def 456";  // your string
    std::set<std::string> tokens;                              // this will receive the words
    // split on space & newline; token_compress_on merges runs of adjacent delimiters
    boost::split(tokens, astring, boost::is_any_of("\n "), boost::token_compress_on);

    // Print the individual words
    BOOST_FOREACH(const std::string& token, tokens){
        std::cout << token << std::endl;
    }
    return 0;
}

A std::list or std::vector can be used instead of the std::set if you need to keep duplicates or the original word order.

Also note this is almost a dupe of: Split a string in C++?

Upvotes: 3

Drise

Reputation: 4388

#include <set>
#include <iostream>
#include <string>

int main()
{
  std::string temp, mystring;
  std::set<std::string> myset;

  while(std::getline(std::cin, temp))
      mystring += temp + ' ';
  temp = "";      

  for (size_t i = 0; i < mystring.length(); i++)
  {
    if (mystring.at(i) == ' ' || mystring.at(i) == '\n' || mystring.at(i) == '\t')
    {
      if (!temp.empty())   // skip the empty strings produced by consecutive separators
        myset.insert(temp);
      temp = "";
    }
    else
    {
      temp.push_back(mystring.at(i));
    }
  }
  if (!temp.empty())       // add the last word, if there is one
    myset.insert(temp);

  for (std::set<std::string>::iterator i = myset.begin(); i != myset.end(); i++)
  {
    std::cout << *i << std::endl;
  }
  return 0;
}

Let's start at the top. First off, you need a few variables to work with. temp is a placeholder that accumulates the current word, one character at a time, from the string you want to parse. mystring is the string you are looking to split up, and myset is where the split strings end up.

Next we read the file (redirected to standard input with <) line by line and append each line, followed by a space, to mystring.

Now we iterate over the string, looking for the spaces, newlines, or tabs that separate words. When we find one of those characters, we insert the word built so far into the set and empty the placeholder. Otherwise, we append the character to the placeholder, which builds up the next word. Once the loop finishes, any word still left in the placeholder is added to the set.

Finally, we iterate over the set and print each string. This is only for verification, but it could be useful in its own right.
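The same scan can also be sketched with std::string's own search functions instead of a character-by-character loop. This is only an illustration: split_into_set is a hypothetical helper name, not something from the code above, and it assumes words are separated by spaces, tabs, or newlines.

```cpp
#include <set>
#include <string>

// Hypothetical helper: splits `text` on spaces, tabs, and newlines and
// returns the distinct words. Runs of separators never yield empty strings.
std::set<std::string> split_into_set(const std::string& text)
{
    const char* const delims = " \t\n";
    std::set<std::string> words;

    // Find the first character that is not a separator.
    std::string::size_type start = text.find_first_not_of(delims);
    while (start != std::string::npos)
    {
        // The word ends at the next separator, or at the end of the string.
        std::string::size_type stop = text.find_first_of(delims, start);
        words.insert(text.substr(start, stop - start));
        // Skip any separators before the next word.
        start = text.find_first_not_of(delims, stop);
    }
    return words;
}
```

Calling split_into_set(mystring) would then take the place of the for loop and the final insert.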

Edit: Loki Astari suggested a significant improvement to my code in a comment, which I think belongs in the answer:

#include <set>
#include <iostream>
#include <string>
#include <utility>  // std::move (requires C++11)

int main()
{
  std::set<std::string> myset;
  std::string word;

  while(std::cin >> word)
  {
      myset.insert(std::move(word));
  }

  for(std::set<std::string>::const_iterator it=myset.begin(); it!=myset.end(); ++it)
    std::cout << *it << '\n';
}

Upvotes: 2

Mooing Duck

Reputation: 66922

The easiest way to construct any container from a source that holds a series of elements is to use the constructor that takes a pair of iterators. Use istream_iterator to iterate over a stream.

#include <set>
#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>

using namespace std;

int main()
{
  //I create an iterator that retrieves `string` objects from `cin`
  auto begin = istream_iterator<string>(cin);
  //I create an iterator that represents the end of a stream
  auto end = istream_iterator<string>();
  //and iterate over the file, and copy those elements into my `set`
  set<string> myset(begin, end);

  //this line copies the elements in the set to `cout`
  //I have this to verify that I did it all right
  copy(myset.begin(), myset.end(), ostream_iterator<string>(cout, "\n"));
  return 0;
}

http://ideone.com/iz1q0
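If the text is already sitting in a std::string, as the question describes, the same iterator-pair constructor should work over a std::istringstream. A rough sketch, where words_from_string is a hypothetical helper name:

```cpp
#include <iterator>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: builds a set of words from text already held in memory.
std::set<std::string> words_from_string(const std::string& text)
{
    // Wrap the string in a stream so istream_iterator can read from it.
    std::istringstream in(text);

    // A default-constructed istream_iterator marks the end of the stream.
    return std::set<std::string>(std::istream_iterator<std::string>(in),
                                 std::istream_iterator<std::string>());
}
```

Because istream_iterator<string> reads with operator>>, any run of spaces, tabs, or newlines counts as a single separator, so no empty strings end up in the set.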

Upvotes: 25
