Jishan

Reputation: 1684

Delimiting Using Multiple Delimiters

So I have an arbitrarily long string that I take as input from the user, and I want to tokenize it and store the tokens in a vector<std::string>. Here is the code that I am using (which may be inspired by my C background):

#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <sstream>
#include <string.h>
using namespace std;

int main()
{
    string input;
    cout << "Input a \' \' or \',\' or \'\\r\' separated string:  ";
    cin >> input;

    vector<string> tokens;

    char *str = new char[input.length() + 1];
    strcpy(str, input.c_str());
    char * pch;
    pch = strtok(str, " , \r");
    while (pch != NULL)
    {
        tokens.push_back(pch);
        pch = strtok(NULL, " , \r");
    }

    for (vector<string>::const_iterator i = tokens.begin(); i != tokens.end(); ++i)
        cout << *i << ' ';
    return 0;
}

However, this only tokenizes the first word and nothing after it:

Input a ' ' or ',' or '\r' separated string:  hello, world I am C.
hello

What am I doing wrong, and what would be the correct way to do this without using a third-party library? Regards.

Upvotes: 2

Views: 419

Answers (1)

Sam Varshavchik

Reputation: 118292

This is, sadly, a quite common pitfall. Many introductory courses and books on C++ teach you to accept interactive input like this:

cin >> input;

Many simple introductory exercises prompt for a single value of some sort, and this works fine for that use case.

Unfortunately, these books don't fully explain what >> actually does: it skips any leading whitespace, then reads input only up to the next whitespace character. Even when input is a string.

So, when you enter a whole line of text, only the first word is read into input. The solution is to use the right tool for the right job: std::getline(), which reads a single line of text and stores it in a single string variable:

getline(cin, input);

Upvotes: 3
