Impostor

Reputation: 2050

Split CSV with delimiter and text qualifier

I have a row like this:

string row = "1;\"2\";\"3;4\";\"5\"6\";\"7;\"8\";9\"";

Now I want to split the row into this result:

[1],[2],[3;4],[5"6],[7;"8],[9"]

Delimiter: ; Qualifier: "

Unfortunately, [5"6] and [7;"8] are merged together.

Code

public static IEnumerable<string> SplitCSV(string input, char separator, char quotechar)
{
    StringBuilder sb = new StringBuilder();
    bool escaped = false;
    for (int i = 0; i < input.Length; i++)
    {
        if (input[i] == separator && !escaped)
        {
            yield return sb.ToString();
            sb.Clear();
        }
        else if (input[i] == separator && escaped)
        {
            sb.Append(input[i]);
        }
        else if (input[i] == quotechar)
        {
            escaped = !escaped;
            sb.Append(input[i]);
        }
        else
        {
            sb.Append(input[i]);
        }
    }
    yield return sb.ToString();
}

Is there a mistake in my code, or is the input invalid according to CSV convention?

Update: Please avoid suggestions about third-party libraries.

Upvotes: 0

Views: 206

Answers (2)

renklus

Reputation: 843

Your quote characters are mismatched.
See the actual groups below:

Input                "1;\"2\";\"3;4\";\"5\"6\";\"7;\"8\";9\"";
Groups                  └───┘ └─────┘ └───┘ └───┘  └───┘  └─
Remaining separators   ;     ;       ;            ;     ;
Result                1,  2  , 3  4  ,  5  6     7,  8  ,9
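To check the grouping yourself, here is a minimal sketch of the same toggle logic. The ToggleDemo name is made up for illustration, and it collects into a list instead of using yield, but the state handling should match the question's SplitCSV: every quote character flips the state, wherever it appears.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ToggleDemo
{
    // Hypothetical helper: same behavior as the question's SplitCSV,
    // where every quote character toggles the "escaped" state.
    public static List<string> Split(string input, char separator, char quotechar)
    {
        var fields = new List<string>();
        var sb = new StringBuilder();
        bool escaped = false;
        foreach (char c in input)
        {
            if (c == quotechar)
                escaped = !escaped;          // toggles on ANY quote, even one inside a field

            if (c == separator && !escaped)
            {
                fields.Add(sb.ToString());   // unescaped separator ends the field
                sb.Clear();
            }
            else
            {
                sb.Append(c);                // quotes and escaped separators are kept
            }
        }
        fields.Add(sb.ToString());
        return fields;
    }
}
```

Called with the row from the question, ';' and '"', this yields six fields: 1, "2", "3;4", "5"6";"7, "8" and 9". The quote after 5 closes a group, the quote after 6 opens a new one, and so the separator between 6 and 7 is swallowed.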

Upvotes: 2

casiosmu

Reputation: 827

I think the problem is that you also use the quote char inside a quoted field. Your if/else does not handle that case; instead, every quote char toggles escaped.

What about this:

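else if (input[i] == quotechar)
{
    // A quote opens a field only at the start of a field,
    // and closes it only right before a separator or the end of the input.
    bool opens = !escaped && (i == 0 || input[i - 1] == separator);
    bool closes = escaped && (i + 1 == input.Length || input[i + 1] == separator);
    if (opens || closes)
        escaped = !escaped;
    else
        sb.Append(input[i]); // any other quote is literal text
}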

BTW: I cannot say for sure whether the original CSV string is valid.
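For what it's worth, RFC 4180 expects a quote inside a quoted field to be escaped by doubling it, so under that convention 5"6 would be written as "5""6" and the row from the question would not be valid as it stands. Below is a rough sketch that assumes this doubling convention. The Rfc4180Sketch name is made up, and it only handles a single row without line breaks.

```csharp
using System.Collections.Generic;
using System.Text;

public static class Rfc4180Sketch
{
    // Hypothetical helper. Assumes quotes inside a quoted field are doubled,
    // i.e. "" inside quotes stands for one literal quote character.
    public static List<string> Split(string input, char separator, char quotechar)
    {
        var fields = new List<string>();
        var sb = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (inQuotes && c == quotechar)
            {
                if (i + 1 < input.Length && input[i + 1] == quotechar)
                {
                    sb.Append(quotechar);   // doubled quote: keep one, skip the other
                    i++;
                }
                else
                {
                    inQuotes = false;       // a single quote closes the field
                }
            }
            else if (!inQuotes && c == quotechar)
            {
                inQuotes = true;            // opening quote, not part of the value
            }
            else if (!inQuotes && c == separator)
            {
                fields.Add(sb.ToString());  // unquoted separator ends the field
                sb.Clear();
            }
            else
            {
                sb.Append(c);
            }
        }
        fields.Add(sb.ToString());
        return fields;
    }
}
```

With the row rewritten as 1;"2";"3;4";"5""6";"7;""8";"9""" this gives 1, 2, 3;4, 5"6, 7;"8 and 9". The trade-off is that whoever produces the file has to double the inner quotes; the parser no longer has to guess which quote ends a field.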

Upvotes: 2
