dani

Reputation: 3887

Regular expression for splitting on commas, except when the comma is within parentheses

I need to separate a string like this:

cat, dog , ant( elephant, lion(tiger)), bird

into this:

cat
dog
ant( elephant, lion(tiger))
bird

My current attempt is (\w+)(,\s*)*, but it also splits elephant, lion and tiger apart. It also keeps some of the commas and spaces.

As you might have guessed, I will apply the same expression again to the ant(...) string in a further iteration. In case it matters, I'll be using this in C++.

Upvotes: 3

Views: 1365

Answers (1)

wally

Reputation: 11002

This regex:

(\w+\(.+\))|\w+

Will parse

cat, dog , ant( elephant, lion(tiger)), bird

Into:

cat
dog
ant( elephant, lion(tiger))
bird

Full program:

#include <iostream>
#include <iterator>
#include <regex>
#include <string>
#include <vector>

int main()
{
    std::string str{R"(cat, dog , ant( elephant, lion(tiger)), bird)"};
    // Match either a word followed by a parenthesized group, or a plain word.
    // The greedy .+ extends to the last ')' in the input.
    std::regex r{R"((\w+\(.+\))|\w+)"};

    // Collect the full text of every match.
    std::vector<std::string> result{};
    auto it = std::sregex_iterator(str.begin(), str.end(), r);
    auto end = std::sregex_iterator();
    for (; it != end; ++it) {
        const auto& match = *it;
        result.push_back(match[0].str());
    }
    std::cout << "Input string: " << str << '\n';
    std::cout << "Result:\n";
    for (const auto& i : result)
        std::cout << i << '\n';
}
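
One caveat with the regex: because .+ is greedy, it runs to the last closing parenthesis in the string, so an input with two parenthesized items such as a(b), c(d) would come back as a single match. If that matters, a possible alternative is to skip the regex and track the parenthesis depth by hand. The following is only a minimal sketch, assuming the parentheses are balanced. The names trim and split_top_level are hypothetical helpers made up for illustration and are not part of the program above.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: strips ASCII whitespace from both ends of s.
static std::string trim(const std::string& s)
{
    std::size_t first = 0;
    std::size_t last = s.size();
    while (first < last && std::isspace(static_cast<unsigned char>(s[first])))
        ++first;
    while (last > first && std::isspace(static_cast<unsigned char>(s[last - 1])))
        --last;
    return s.substr(first, last - first);
}

// Hypothetical helper: splits input on commas that are not inside parentheses.
// Assumes balanced parentheses; items that are empty after trimming are dropped.
std::vector<std::string> split_top_level(const std::string& input)
{
    std::vector<std::string> items;
    std::string current;
    int depth = 0;  // number of currently open '('

    for (char c : input) {
        if (c == '(')
            ++depth;
        else if (c == ')' && depth > 0)
            --depth;

        if (c == ',' && depth == 0) {
            // A top-level comma ends the current item.
            std::string item = trim(current);
            if (!item.empty())
                items.push_back(item);
            current.clear();
        } else {
            current += c;
        }
    }

    // Whatever follows the last top-level comma is the final item.
    std::string item = trim(current);
    if (!item.empty())
        items.push_back(item);
    return items;
}
```

Calling split_top_level on the string from the question gives the same four items as the regex. Calling it again on the text inside the parentheses of ant( elephant, lion(tiger)) gives elephant and lion(tiger), which fits the plan of repeating the split in a further iteration.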


Upvotes: 3
