Reputation: 760
Given a line like this (an example line from a would-be CSV file):
""",100,""a"sa",""," "","" ","a"z","a"",""z","""",""",200,"a"a",""
I want a regex that matches every quotation mark " that does not enclose a string, so that I can remove those quotes in a later stage and build a 100% compliant CSV.
I came up with this partial solution: (?<!,)"(?!,)
It uses a negative lookbehind and a negative lookahead to match only the non-enclosing " characters.
It almost does the trick, but the very first and the very last character of each line, both ", are matched by the regex too.
example: https://regexr.com/41kve
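The problem described above can be reproduced with a short Python sketch. Python is only an assumption here (the question does not name a language), and the names `line`, `partial` and `positions` are made up for illustration.

```python
import re

# The sample line from the question.
line = '""",100,""a"sa",""," "","" ","a"z","a"",""z","""",""",200,"a"a",""'

# The partial pattern from the question: a quote with no comma on either side.
partial = re.compile(r'(?<!,)"(?!,)')

positions = [m.start() for m in partial.finditer(line)]

# The enclosing quotes at both ends of the line have no comma next to them,
# so they are matched as well, which is the unwanted behaviour.
print(0 in positions)              # True
print(len(line) - 1 in positions)  # True
```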
I want a regex that works around this, so that the first and last characters of the line are not matched.
Any ideas how to do it?
Upvotes: 2
Views: 49
Reputation: 626748
You may use
(?<!,|^)"(?!\s*(?:,|$))
See the regex demo
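As a rough way to try the pattern outside the demo, here is a minimal Python sketch. Two things are assumptions on my side and not part of the answer: Python's built-in `re` module only accepts fixed-width lookbehinds, so `(?<!,|^)` is split into `(?<!,)(?<!^)`, and `re.MULTILINE` is added so that `^` and `$` apply to each line of a multi-line input. The names `inner_quote` and `strip_inner_quotes` are hypothetical.

```python
import re

# The sample line from the question.
line = '""",100,""a"sa",""," "","" ","a"z","a"",""z","""",""",200,"a"a",""'

# Adaptation for Python's re module: the variable-width lookbehind (?<!,|^)
# is written as two separate fixed-width lookbehinds with the same meaning.
inner_quote = re.compile(r'(?<!,)(?<!^)"(?!\s*(?:,|$))', re.MULTILINE)

def strip_inner_quotes(text: str) -> str:
    """Remove every quote that does not open or close a field."""
    return inner_quote.sub('', text)

cleaned = strip_inner_quotes(line)
print(cleaned)
# "",100,"asa",""," "," ","az","a","z","","",200,"aa",""
```

Note that this only covers input like the sample: a stray quote that sits directly after a comma inside a field would still be treated as an enclosing quote.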
Details
(?<!,|^) - no start of string or comma immediately to the left of the current location
" - a double quotation mark
(?!\s*(?:,|$)) - no , or end of string preceded with 0+ whitespaces immediately to the right of the current location.
Upvotes: 2