Edge

Reputation: 2540

C++ Tokenize String

I'm looking for a simple way to tokenize std::string input without using non-default libraries such as Boost.

For example, if the user enters forty_five, I would like to separate 'forty' and 'five' using the '_' as the delimiter.

Upvotes: 15

Views: 18250

Answers (4)

Amit

Reputation: 1160

C++20

#include <string>
#include <ranges>
#include <algorithm>
#include <iterator>  // std::back_inserter
#include <iostream>

int main()
{
    const std::string input{ "C++20 Tokenization Example" };

    for (const auto& token_range : input | std::views::split(' ')) {
        std::string token{};
        std::ranges::copy(token_range, std::back_inserter(token));
        std::cout << token << std::endl;
    }
}

Output:

C++20
Tokenization
Example


Upvotes: 2

Jaime Ivan Cervantes

Reputation: 3697

Look at this tutorial, which is by far the best tutorial on tokenization that I have found so far. It covers best practices for several approaches, including std::getline() and find_first_of() from the C++ standard library and strtok() from C.
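As a rough illustration of the getline() approach mentioned above (this is a minimal sketch, not code taken from the linked tutorial, and the helper name splitWithGetline is hypothetical), a string can be wrapped in a std::istringstream and read piece by piece with a single-character delimiter:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper for illustration: splits `input` on one delimiter
// character by reading from a string stream with std::getline.
std::vector<std::string> splitWithGetline(const std::string& input, char delimiter)
{
    std::vector<std::string> tokens;
    std::istringstream stream(input);
    std::string token;

    // Each call reads up to the next delimiter and discards the delimiter.
    while (std::getline(stream, token, delimiter)) {
        tokens.push_back(token);
    }
    return tokens;
}
```

Calling splitWithGetline("forty_five", '_') yields the two tokens forty and five. Note that this approach keeps empty tokens between adjacent delimiters but drops an empty token after a trailing delimiter, so it may need adjusting if that matters for your input.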

Upvotes: 0

Mahmoud Al-Qudsi

Reputation: 29579

To convert a string to a vector of tokens (thread safe). Because it uses find_first_of, every character in delimiter is treated as a separate delimiter:

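#include <cstddef>
#include <string>
#include <vector>

// Splits source at any character contained in delimiter.
// Empty tokens are skipped unless keepEmpty is true.
inline std::vector<std::string> StringSplit(const std::string &source, const char *delimiter = " ", bool keepEmpty = false)
{
    std::vector<std::string> results;

    std::size_t prev = 0;
    std::size_t next = 0;

    while ((next = source.find_first_of(delimiter, prev)) != std::string::npos)
    {
        if (keepEmpty || (next - prev != 0))
        {
            results.push_back(source.substr(prev, next - prev));
        }
        prev = next + 1;
    }

    // Remaining text after the last delimiter; with keepEmpty set, an empty
    // trailing token (input ending in a delimiter) is kept as well.
    if (keepEmpty || prev < source.size())
    {
        results.push_back(source.substr(prev));
    }

    return results;
}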

Upvotes: 27

Adam Liss

Reputation: 48330

You can use the strtok_r function (POSIX, not part of standard C++), but read the man pages carefully so you understand how it maintains state.
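To make the point about state concrete, here is a minimal sketch, assuming a POSIX system where strtok_r is declared in <string.h>. The helper name splitWithStrtokR is hypothetical. The key details are that strtok_r overwrites delimiters in the buffer it scans, so it needs a mutable copy of the string, and that its position is kept in a caller-supplied pointer rather than in hidden global state:

```cpp
#include <string.h>  // strtok_r (POSIX)
#include <string>
#include <vector>

// Hypothetical helper for illustration: tokenizes a copy of `input`,
// treating every character in `delimiters` as a separator.
std::vector<std::string> splitWithStrtokR(const std::string& input, const char* delimiters)
{
    std::vector<std::string> tokens;

    // strtok_r writes '\0' over delimiters, so work on a mutable,
    // null-terminated copy instead of the original string.
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    char* state = nullptr;  // parsing position, owned by the caller

    // First call passes the buffer; later calls pass nullptr and reuse `state`.
    for (char* token = strtok_r(buffer.data(), delimiters, &state);
         token != nullptr;
         token = strtok_r(nullptr, delimiters, &state)) {
        tokens.emplace_back(token);
    }
    return tokens;
}
```

Calling splitWithStrtokR("forty_five", "_") yields forty and five. Unlike a find-based split, strtok_r collapses runs of delimiters, so empty tokens are never returned.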

Upvotes: 0
