user14043526

C++ separate string by selected commas

I was reading the question "Parsing a comma-delimited std::string" on how to split a string by commas (someone linked it from my previous question), and one of the answers was:

stringstream ss( "1,1,1,1, or something else ,1,1,1,0" );
vector<string> result;

while( ss.good() )
{
    string substr;
    getline( ss, substr, ',' );
    result.push_back( substr );
}

But what if my string looked like the following, and I wanted to split it only at the bold commas (the ones outside the <> pairs), ignoring the commas that appear inside <>?

<a,b>,<c,d>,,<d,l>,

I want to get:

<a,b>

<c,d>

"" //Empty string

<d,l>

""
  1. Given: <a,b>,,<c,d> It should return: <a,b> and "" and <c,d>

  2. Given: <a,b>,<c,d> It should return: <a,b> and <c,d>

  3. Given: <a,b>, It should return: <a,b> and ""

  4. Given: <a,b>,,,<c,d> It should return: <a,b> and "" and "" and <c,d>

In other words, my program should behave just like the solution above would if the bold commas were the only commas in the string.
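To make the expected behaviour concrete, here is a minimal sketch that wraps the quoted loop in a function. The name naiveSplit is made up for illustration and is not part of the linked answer. It treats every comma as a separator, so it only shows the empty-item behaviour wanted here (consecutive commas and a trailing comma each yield an empty string), not the handling of <>.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: the quoted getline loop wrapped in a function.
// It splits on EVERY comma and knows nothing about '<' and '>'.
std::vector<std::string> naiveSplit(const std::string& input) {
    std::stringstream ss(input);
    std::vector<std::string> result;
    while (ss.good()) {
        std::string substr;
        // Reads up to the next ',' (or to the end of the input).
        std::getline(ss, substr, ',');
        result.push_back(substr);
    }
    return result;
}
```

With single letters standing in for the bracketed groups, naiveSplit("A,,B,") gives four items: "A", "", "B" and "". That is the shape of result wanted for <a,b>,,<c,d>, with the bracketed groups kept whole.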


Here are some suggested solutions and their problems:

Delete all bold commas: this would treat the following two inputs the same way, although they should give different results:

<a,b>,<c,d>

<a,b>,,<c,d>

Replace all bold commas with some other character and use the above algorithm: I can't pick a replacement character, since any character could appear in the rest of my string.

Upvotes: 1

Views: 293

Answers (5)

John

Reputation: 827

Seems quite straightforward to me.

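#include <string>
#include <vector>

std::vector<std::string> customSplit(const std::string& s)
{
    std::vector<std::string> results;
    int level = 0;        // nesting depth of '<' ... '>'
    std::string current;  // item currently being built
    for (char c : s)
    {
        switch (c)
        {
            case ',':
                if (level == 0)
                {
                    // Top-level comma: finish the current item.
                    results.push_back(current);
                    current.clear();
                }
                else
                {
                    current += c;
                }
                break;
            case '<':
                ++level;
                current += c;
                break;
            case '>':
                --level;
                current += c;
                break;
            default:
                current += c;
        }
    }

    results.push_back(current); // the last item (possibly empty)
    return results;
}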

Upvotes: 0

Serge Ballesta

Reputation: 148950

Here is one possible implementation that scans the string one character at a time and splits it on commas (',') unless they are masked between brackets ('<' and '>').

Algo:

assume starting outside brackets
loop for each character:
   if not a comma, or if inside brackets
       store the character in the current item
       if a < bracket: note that we are inside brackets
       if a > bracket: note that we are outside brackets
   else (an unmasked comma)
       store the current item as a string into the resulting vector
       clear the current item
store the last item into the resulting vector

Only 10 lines and my rubber duck agreed that it should work...

C++ implementation: I use a vector of chars to hold the current item because it is easier to build it one character at a time.

std::vector<std::string> parse(const std::string& str) {
    std::vector<std::string> result;
    bool masked = false;
    std::vector<char> current;        // stores chars of the current item
    for (const char c : str) {
        if (masked || (c != ',')) {
            current.push_back(c);
            switch (c) {
            case '<': masked = true; break;
            case '>': masked = false;
            }
        }
        else {            // unmasked comma: store item and prepare next
            current.push_back('\0');  // a terminating null for the vector data
            result.push_back(std::string(&current[0]));
            current.clear();
        }
    }
    // do not forget the last item...
    current.push_back('\0');
    result.push_back(std::string(&current[0]));
    return result;
}

I tested it with all your example strings and it gives the expected results.

Upvotes: 0

Timmel

Reputation: 17

I think what you want is something like this:

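#include <string>
#include <vector>
using namespace std;

int main() {
    vector<string> result;
    string s = "<a,b>,,<c,d>";
    string current;          // item currently being built
    bool in_string = false;  // true while between '<' and '>'

    for (size_t i = 0; i < s.size(); i++) {
        if (s[i] == '<') {
            current.push_back(s[i]);
            in_string = true;
        }
        else if (s[i] == '>') {
            current.push_back(s[i]);
            in_string = false;
        }
        else if (!in_string && s[i] == ',') {
            // separator comma: store the finished item (possibly empty)
            result.push_back(current);
            current.clear();
        }
        else
            current.push_back(s[i]);
    }
    result.push_back(current);  // do not forget the last item
    // result now holds "<a,b>", "" and "<c,d>"
}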

Upvotes: 0

kesarling

Reputation: 2216

Adding to @Carlos' answer: apart from regex (take a look at my comment), you can implement the substitution like the following (here, I actually build a new string rather than modifying the original):

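#include <iostream>
#include <string>

int main() {
    // Assumption: this placeholder character never occurs in the input.
    const char placeholder = '\x1f';
    std::string str;
    std::getline(std::cin, str);
    std::string str_builder;
    bool flag = false;  // true while inside '<' ... '>'
    for (auto it = str.begin(); it != str.end(); ++it) {
        if (*it == '<') {
            flag = true;
        }
        else if (*it == '>') {
            flag = false;
        }
        // Substitute only the commas that are inside the brackets.
        str_builder += (flag && *it == ',') ? placeholder : *it;
    }
    std::cout << str_builder << '\n';
}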

Upvotes: 1

Carlos

Reputation: 6021

Why not replace one set of commas with some known-to-not-clash character, then split it by the other commas, then reverse the replacement?

So replace the commas that are inside the <> with something, do the string split, replace again.
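A rough sketch of those three steps, under the assumption that some placeholder character is known never to occur in the input (the question says this may not hold, so treat it as an illustration only). The function name splitViaPlaceholder and the default placeholder '\x1f' are made up for this example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper illustrating replace / split / replace back.
// ASSUMPTION: `placeholder` never appears in the input string.
std::vector<std::string> splitViaPlaceholder(std::string s,
                                             char placeholder = '\x1f') {
    // Step 1: mask the commas that sit inside <...>.
    bool inside = false;
    for (char& c : s) {
        if (c == '<') inside = true;
        else if (c == '>') inside = false;
        else if (inside && c == ',') c = placeholder;
    }

    // Step 2: split on the remaining (outer) commas.
    std::vector<std::string> parts;
    std::stringstream ss(s);
    while (ss.good()) {
        std::string item;
        std::getline(ss, item, ',');
        // Step 3: turn the placeholders back into commas.
        std::replace(item.begin(), item.end(), placeholder, ',');
        parts.push_back(item);
    }
    return parts;
}
```

For the input <a,b>,<c,d>,,<d,l>, this returns five items: <a,b>, <c,d>, an empty string, <d,l> and a final empty string. If the placeholder can occur in the data, the restore step would corrupt it, which is the objection raised in the question.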

Upvotes: 0
