Reputation: 217
I have a large number of CSV files that use double quotes as the field delimiter, but some fields contain the same character, as in the example below.
"MAIN 8" PIPE, PART B","Report 7"
I'm trying to just match the extra '"' character so that I can replace it with another character to read in the files.
I tried using the regex ([^","])"([^","]). But it matched the characters 8" and 7", where I would just like the " in the middle of 8".
I know I need it to check for the end-of-line character and not match the characters around the " that I am looking for. Any tips on how to do this are appreciated.
Upvotes: 2
Views: 2245
Reputation: 10360
Try this Regex:
(?<!^|",)"(?!,"|$)
OR
Replace each match with some other character, say #
Explanation (1st regex):
(?<!^|",) - matches a position that is immediately preceded by neither the start of the string nor the sequence ",
" - matches " literally
(?!,"|$) - matches a position that is not immediately followed by either the end of the string or the sequence ,"
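As a rough illustration, the first regex can be tried in Python on the sample line from the question. This is only a sketch under some assumptions: Python's built-in re module rejects a lookbehind whose alternatives have different widths, so (?<!^|",) is split here into two separate lookbehinds that should behave the same way. The helper name replace_stray_quotes is made up for this example, and each line is assumed to be processed on its own with its line ending already stripped.

```python
import csv
import re

# Assumption: (?<!^|",) is written as two lookbehinds because Python's re
# module only accepts fixed-width lookbehind patterns.
STRAY_QUOTE = re.compile(r'(?<!^)(?<!",)"(?!,"|$)')


def replace_stray_quotes(line, replacement="#"):
    """Replace quotes that neither open nor close a field (hypothetical helper)."""
    return STRAY_QUOTE.sub(replacement, line)


line = '"MAIN 8" PIPE, PART B","Report 7"'
print(replace_stray_quotes(line))  # "MAIN 8# PIPE, PART B","Report 7"

# Doubling the stray quote instead gives a line the csv module can parse.
escaped = replace_stray_quotes(line, '""')
print(next(csv.reader([escaped])))  # ['MAIN 8" PIPE, PART B', 'Report 7']
```

Replacing with a doubled quote rather than # is a design choice of this sketch, not part of the answer: it keeps the original character in the parsed field.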
Upvotes: 1
Reputation: 10476
In my opinion, the source of the problem should be fixed. However, you may try the following solution to find the stray double quotes and replace them accordingly:
Search for:
(?<!,)(?<!^)\"+(?!\s*(?:,|$))
And replace with anything you want.
The above will work, but to make it safer you should first do a clean-up replace operation; otherwise neither my regex nor any other regex solution will work for a case like this:
"abac"c " , " blaljak"sdf "
So what you need to do is remove any whitespace that sits before or after a double quote.
Apply this replace operation first:
\s*(\")\s*
and replace with this:
\1
Then apply the original regex I provided.
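The two steps above could be chained in Python roughly as follows. This is a sketch, not a definitive implementation: the helper name clean_line and the # replacement character are assumptions for illustration, and the only change to the regexes is dropping the redundant backslash before the double quote.

```python
import re

# Step 1: drop whitespace on either side of every double quote.
AROUND_QUOTE = re.compile(r'\s*(")\s*')
# Step 2: the search regex from this answer.
STRAY_QUOTE = re.compile(r'(?<!,)(?<!^)"+(?!\s*(?:,|$))')


def clean_line(line, replacement="#"):
    """Apply the two replace operations in order (hypothetical helper)."""
    tightened = AROUND_QUOTE.sub(r"\1", line)
    return STRAY_QUOTE.sub(replacement, tightened)


print(clean_line('"abac"c " , " blaljak"sdf "'))  # "abac#c","blaljak#sdf"
```

Because \s also matches line breaks, this sketch is meant to be run on one line at a time with the line ending already removed. Note that step 1 also removes legitimate spaces next to a stray quote inside a field.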
Upvotes: 1
Reputation: 16968
I'm using Notepad++, so I can use the regex below in the Replace window to do it:
Find What:
(^|,)([^"][^,]+?)"(?=,|$)
Replace with:
\1\2
Upvotes: 0