Kevin Mee

Reputation: 549

Regex split on comma but ignoring when inside text qualifier containing two characters

I've seen many examples of splitting on commas while ignoring the commas inside single or double quotes. I'm looking for something similar, except that instead of a single or double quote, the text qualifier needs to be the two-character sequence ~*

I tried to modify some code I found that used a double quote as the text qualifier, but was unsuccessful. I am terrible with regex and spent some time today reading the documentation, hoping to understand it well enough to write an expression that would work for my case.

Is it possible to use two characters as the text qualifier?

Example of one of the lines:

~* header1~*, ~* header2 ~*, ~* header3, value1 ~*

I am looking for the output to be:

 ~* header1~*, 
 ~* header2~*, 
 ~* header3,value1~*

var result = Regex.Split(line, ",(?=(?:[^']*'[^']*')*[^']*$)");

Upvotes: 0

Views: 436

Answers (3)

A.sharif

Reputation: 2017

Do this in two steps.

First, replace every "~* " (a qualifier followed by whitespace) using the expression "~\*\s", substituting a single space, " ". (This gets rid of the ~* markers that don't mark a split point.)

Then split on "~\*,"

EDIT:

You should be able to split using this expression "(?<=(~\*,))\s"
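Both suggestions above can be sketched in C# roughly as follows. The class and method names are hypothetical, and the cleanup in the two-step variant (trimming whitespace and removing the last field's trailing ~*) is an added assumption, not part of the original suggestion. The single-pattern variant drops the inner parentheses from the lookbehind because, in .NET, Regex.Split also returns any text matched by capturing groups, which would add extra "~*," entries to the result.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QualifierSplit
{
    // Two-step variant: drop the opening "~* " markers, then split on the closing "~*,".
    public static string[] TwoStep(string line)
    {
        string stripped = Regex.Replace(line, @"~\*\s", " ");
        return Regex.Split(stripped, @"~\*,")
            // Assumption: strip the final field's trailing "~*" and surrounding whitespace.
            .Select(p => Regex.Replace(p, @"~\*\s*$", "").Trim())
            .ToArray();
    }

    // Single-pattern variant: split on the whitespace character that follows "~*,".
    // The qualifiers and the trailing comma stay in each piece.
    public static string[] LookbehindSplit(string line)
    {
        return Regex.Split(line, @"(?<=~\*,)\s");
    }
}
```

With the line from the question, TwoStep should yield the bare values, while LookbehindSplit keeps the ~* qualifiers around each field.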

Upvotes: 1

Alexander Petrov

Reputation: 14261

There's no need for Split; match the qualified fields directly instead.

string input = "~* header1~*, ~* header2 ~*, ~* header3, value1 ~*";
string pattern = @"~\* \s* .+? \s* ~\*";
var matches = Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace);
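To read the results, you could iterate over the match collection. The sketch below is a small variation on the pattern above: it adds a capturing group (my addition, not in the original pattern) so that each field's text is available without the ~* qualifiers. The class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class QualifierMatches
{
    // Returns the text between each pair of "~*" qualifiers, without surrounding whitespace.
    public static List<string> Fields(string input)
    {
        // Spaces in the pattern are ignored because of IgnorePatternWhitespace.
        // Group 1 (an addition to the original pattern) holds the inner text.
        string pattern = @"~\* \s* (.+?) \s* ~\*";
        var fields = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnorePatternWhitespace))
        {
            fields.Add(m.Groups[1].Value);
        }
        return fields;
    }
}
```

If you want the qualifiers kept, as in the desired output from the question, use m.Value instead of m.Groups[1].Value.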

Upvotes: 1

Jens R.

Reputation: 191

You can use one single regular expression to achieve the desired output:

/~\*(.*)~\*[,\s]*/gU

The capturing group of each match will then contain one string. Have a look at a working example: https://www.regex101.com/r/zP3aM3/2
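Note that the U (ungreedy) flag in the expression above is PCRE syntax; as far as I know, .NET has no equivalent option. A rough C# translation, assuming a lazy quantifier (.*?) is an acceptable stand-in for the flag, might look like this. The class and method names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QualifierGroups
{
    // Returns group 1 of every match: the raw text between two "~*" qualifiers.
    public static string[] Fields(string input)
    {
        // ".*?" replaces the PCRE U flag; "[,\s]*" consumes the separator after each field.
        return Regex.Matches(input, @"~\*(.*?)~\*[,\s]*")
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToArray();
    }
}
```

Unlike the other approaches, this keeps the whitespace inside the qualifiers, so call Trim() on each value if you don't want it.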

Upvotes: 1
