Reputation: 2494
I'm creating a comma-separated script to store various pieces of data for a project.
The last part of my data line is always a summary of the preceding data (it is always a string of any character except newline characters). The problem is that I split the entire line along the commas, so if this summary portion of the line has commas in it, anything after the excess commas will be split as well, which I don't want.
So I'd like to make my own escape character for commas. I figure that the least error-prone way to do this is with regular expressions.
I've come up with the following expression:
^,(?!\\,)$
I had hoped it would match commas, but not escaped commas. Unfortunately, it did not work.
The following two lines illustrate how my data is separated.
01, 0, 80.0, 0x00100204, 0x00000000, 0x00000800, 0xFFFFF800, 0.02, 0.5, Channel 01: Voltage Offset\,\,\,comma
02, 0, 80.0, 0x00100208, 0x00000000, 0x00000800, 0xFFFFF800, 0.02, 0.5, Channel 02: Voltage Offset
Note that in the first line of data, I have excess commas in there, denoted by \,\,\,comma
But when I call
Regex.Split(line, @"^,(?!\,)$");
nothing happens: I just get a single-element array containing my entire string.
Upvotes: 1
Views: 961
Reputation: 30985
This is a good use case for a negative lookbehind, which matches a comma only when it is not preceded by a backslash:
(?<!\\),
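For illustration, here is a minimal sketch of how this pattern could be used with Regex.Split on the first sample line from the question. The original attempt matched nothing because ^ and $ anchor the pattern to the start and end of the line, so it could only match a line consisting of a single comma, and its lookahead inspected what follows the comma rather than what precedes it. The class and method names (EscapedCsv, SplitEscaped) and the trimming and unescaping steps are my own assumptions, not part of the answer above.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class EscapedCsv
{
    // Hypothetical helper: split on commas that are NOT preceded by a backslash,
    // then trim each field and turn the escaped "\," back into a plain comma.
    public static string[] SplitEscaped(string line)
    {
        return Regex.Split(line, @"(?<!\\),")
                    .Select(field => field.Trim().Replace(@"\,", ","))
                    .ToArray();
    }

    public static void Main()
    {
        string line = @"01, 0, 80.0, 0x00100204, 0x00000000, 0x00000800, 0xFFFFF800, 0.02, 0.5, Channel 01: Voltage Offset\,\,\,comma";
        string[] fields = SplitEscaped(line);

        Console.WriteLine(fields.Length); // 10
        Console.WriteLine(fields[9]);     // Channel 01: Voltage Offset,,,comma
    }
}
```

One limitation of this sketch: a field that ends in a literal backslash cannot be represented, because the lookbehind treats any backslash directly before a comma as an escape.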
Upvotes: 1
Reputation: 89547
You can use this pattern, which checks that the comma is not preceded by a backslash:
Regex.Split(line, @"(?<!\\), ");
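(?<!...) is a negative lookbehind assertion and means "not preceded by". Note that this pattern also consumes the single space after each comma, so the fields in your sample data come back without a leading space.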
Upvotes: 1
Reputation: 51330
If you want to use regex, instead of splitting the string I'd suggest capturing the fields by matching the following regex:
\s*((?:\\.|[^\\])+?)\s*(?:,\s*|$)
Demo: http://regex101.com/r/lP8yE1/4
Each match will be a field, and the value will be the contents of capture group 1.
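As a rough sketch of how this could look in C#, the following collects capture group 1 from every match. The class and method names (FieldMatcher, ReadFields) are hypothetical, and the sample line is the first one from the question. The pattern treats a backslash followed by any character as one unit, so an escaped comma is never taken as a separator.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldMatcher
{
    // Group 1 is the field: a lazy run of "backslash + any char" pairs or
    // non-backslash characters, ending at the first unescaped comma or end of line.
    private static readonly Regex FieldPattern =
        new Regex(@"\s*((?:\\.|[^\\])+?)\s*(?:,\s*|$)");

    // Hypothetical helper: return the contents of group 1 for each match.
    public static List<string> ReadFields(string line)
    {
        var fields = new List<string>();
        foreach (Match match in FieldPattern.Matches(line))
        {
            fields.Add(match.Groups[1].Value);
        }
        return fields;
    }

    public static void Main()
    {
        string line = @"01, 0, 80.0, 0x00100204, 0x00000000, 0x00000800, 0xFFFFF800, 0.02, 0.5, Channel 01: Voltage Offset\,\,\,comma";
        List<string> fields = ReadFields(line);

        Console.WriteLine(fields.Count); // 10
        Console.WriteLine(fields[9]);    // Channel 01: Voltage Offset\,\,\,comma
    }
}
```

The captured values still contain the escape backslashes, so you would need a separate unescaping step if you want plain commas in the summary. Because the group requires at least one character, this sketch also does not handle empty fields.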
Upvotes: 1