Reputation: 7266
I have this regex used to split a string:
,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)
Example string:
"Field1","Field2","item1,item2,item3","Hello,""John"""
The one thing I understand is that it splits the string on , but I am not sure what the rest of it does.
Can anyone explain this regex, please?
If you can dissect it down to the simplest possible level, I would appreciate it.
Upvotes: 1
Views: 606
Reputation: 785058
This regex matches a comma , only if it is outside double quotes, by checking that an even number of quotes follows the literal , .
Explanation:
, -> match literal comma
(?=...) -> positive lookahead (asserts what follows without consuming it)
[^"]*" -> match any run of non-quote characters, followed by a literal "
[^"]*"[^"]*" -> match two of the above in a row, i.e. one pair of quotes
(?:[^"]*"[^"]*")* -> match 0 or more such pairs (0, 2, 4, 6, ... quotes)
[^"]*$ -> followed by any non-quote characters up to the end of the string
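The pieces above can be assembled and tried out. Below is a minimal sketch in Python (the thread does not name a language, so Python's re module and the variable names are assumptions made for illustration), applied to the string from the question. The \" in the question's version of the pattern is most likely a string-literal escape; inside the regex itself a plain " behaves the same.

```python
import re

# Build the pattern from the pieces described above.
quote_pair = r'[^"]*"[^"]*"'   # non-quotes, a quote, non-quotes, a quote
tail = r'[^"]*$'               # no further quotes up to the end of the string
pattern = re.compile(r',(?=(?:' + quote_pair + r')*' + tail + r')')

line = '"Field1","Field2","item1,item2,item3","Hello,""John"""'

# Split only on commas that are followed by an even number of quotes.
fields = pattern.split(line)
for field in fields:
    print(field)
```

The commas inside "item1,item2,item3" and inside "Hello,""John""" are left alone, so the string comes back as four fields with their quotes still attached.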
Example Input:
"Field1,Field2","Field3","item1,item2,item3"
The regex matches:
, before "Field3", because the lookahead (?=(?:[^"]*"[^"]*")*[^"]*$) confirms there are 4 double quotes after this comma.
, after "Field3", because the same lookahead confirms there are 2 double quotes after this comma.
It does not match:
, between Field1 and Field2, because the number of quotes after it is odd (5), so the lookahead (?=(?:[^"]*"[^"]*")*[^"]*$) fails.
Upvotes: 4
Reputation: 67968
,(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)
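This will not split on a , that is inside a pair of double quotes.
The lookahead says that everything after a matched , must consist of zero or more groups of the form something " something " followed by text that contains no quotes.
So effectively a matched , cannot sit between an opening " and its closing " .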
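One way to check this reading is to compare the regex against a direct parity count: a comma is outside quotes exactly when an even number of " characters follows it. Below is a rough sketch in Python (chosen only for illustration; the helper name commas_outside_quotes is hypothetical), assuming the quotes in the input are balanced.

```python
import re

PATTERN = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')

def commas_outside_quotes(text):
    """Return the indexes of commas followed by an even number of double quotes."""
    return [i for i, ch in enumerate(text)
            if ch == ',' and text.count('"', i + 1) % 2 == 0]

sample = '"Field1,Field2","Field3","item1,item2,item3"'

regex_hits = [m.start() for m in PATTERN.finditer(sample)]
manual_hits = commas_outside_quotes(sample)

print(regex_hits)   # [15, 24]
print(manual_hits)  # [15, 24]
```

Both approaches pick out only the two commas that separate the quoted fields. If the input has an unbalanced quote, the parity argument no longer corresponds to "inside" or "outside" quotes, and neither approach should be trusted.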
Upvotes: 3