audiFanatic

Reputation: 2494

Parsing for escape characters with a regular expression

I'm creating a comma-separated script to store various pieces of data for a project.

The last part of my data line is always a summary of the preceding data (it is always a string of any character except newline characters). The problem is that I split the entire line along the commas, so if this summary portion of the line has commas in it, anything after the excess commas will be split as well, which I don't want.

So I'd like to make my own escape character for commas. I figure that the least error-prone way to do this is with regular expressions.

I've come up with the following expression, ^,(?!\\,)$ which I had hoped would look for commas, but not escaped commas. Unfortunately, it did not work.

The following two lines illustrate how my data is separated.

01, 0, 80.0, 0x00100204, 0x00000000, 0x00000800, 0xFFFFF800, 0.02, 0.5, Channel 01: Voltage Offset\,\,\,comma 
02, 0, 80.0, 0x00100208, 0x00000000, 0x00000800, 0xFFFFF800, 0.02, 0.5, Channel 02: Voltage Offset

Note that in the first line of data, I have excess commas in there, denoted by \,\,\,comma

But when I call Regex.Split(line, @"^,(?!\,)$"); nothing is split: I just get a single-element array containing my entire string.

Upvotes: 1

Views: 961

Answers (3)

Federico Piazza

Reputation: 30985

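This is a good use case for a negative lookbehind. Your pattern cannot work for two reasons: ^ and $ anchor it to the whole line, so it only matches a line that consists of a single comma, and (?!...) is a lookahead, which inspects what follows the comma rather than what precedes it. Match a comma that is not preceded by a backslash instead: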

(?<!\\),

Upvotes: 1

Casimir et Hippolyte

Reputation: 89547

You can use this pattern, which splits on a comma followed by a space only when the comma is not preceded by a backslash:

Regex.Split(line, @"(?<!\\), ");

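(?<!...) is a negative lookbehind assertion and means "not preceded by" the enclosed pattern, here a backslash. The split fields still contain the escape sequence \, so you need to replace it with a plain comma afterwards.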
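
As a minimal sketch of how the lookbehind split could be wrapped up, assuming you also want the \, sequences turned back into plain commas: the class and method names below (EscapedCommaSplitter, SplitFields) are hypothetical, and the pattern uses \s* instead of a single space so that the separator may be followed by any amount of whitespace. It does not handle an escaped backslash directly before a comma (\\,), which the question does not mention.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answers.
public static class EscapedCommaSplitter
{
    // A comma that is not immediately preceded by a backslash,
    // together with any whitespace that follows it.
    private static readonly Regex Separator = new Regex(@"(?<!\\),\s*");

    public static string[] SplitFields(string line)
    {
        return Separator.Split(line.Trim())
            // Turn each escaped comma back into a literal comma.
            .Select(field => field.Replace(@"\,", ","))
            .ToArray();
    }
}
```

Calling EscapedCommaSplitter.SplitFields on the first sample line should give ten fields, the last one being the whole summary with its three commas restored.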

Upvotes: 1

Lucas Trzesniewski

Reputation: 51330

If you want to use a regex, then instead of splitting the string I'd suggest capturing the fields by matching the following pattern:

\s*((?:\\.|[^\\])+?)\s*(?:,\s*|$)

Demo: http://regex101.com/r/lP8yE1/4

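Each match will be a field, and the value will be the contents of capture group 1. Because \\. consumes a backslash together with the character after it, an escaped comma can never be taken as a separator.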
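
A rough sketch of how the matching approach might look in C#, assuming the same pattern as above: the names FieldMatcher and MatchFields are made up for this example, and the final unescaping step (dropping the backslash from any \x pair) is an assumption about what you want to do with the escapes.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answer.
public static class FieldMatcher
{
    // Group 1 is the field: a lazy run of escaped pairs (\x) or non-backslash
    // characters, ended by a comma or the end of the line.
    private static readonly Regex Field =
        new Regex(@"\s*((?:\\.|[^\\])+?)\s*(?:,\s*|$)");

    // A backslash followed by any single character.
    private static readonly Regex Escape = new Regex(@"\\(.)");

    public static List<string> MatchFields(string line)
    {
        var fields = new List<string>();
        foreach (Match m in Field.Matches(line))
        {
            // Keep the escaped character and drop the backslash.
            fields.Add(Escape.Replace(m.Groups[1].Value, "$1"));
        }
        return fields;
    }
}
```

Unlike the split approach, this also trims the whitespace around each field as part of the match.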

Upvotes: 1
