Reputation: 32326
The quotation marks are not matching in the following file.
# cat t123.txt
"first", "second", "and last
"second", "line", "ok"
"third", "line", "not, "ok"
Only the second line is OK. How do I find the lines, here the first and third, that do not have consistent quotation marks?
I tried the following, based on an article that I found, but it does not return the expected results:
https://regex101.com/r/nhDKA2/4
Upvotes: 0
Views: 34
Reputation: 7616
Strictly speaking, even your second line is not standard CSV, which does not allow a space after the comma.
You can use this regex to test for valid lines based on your CSV spec:
^(?="[^"]*(", "[^"]*)*"$).*"$
Explanation:
^(?= ... ) - positive lookahead at the beginning for:
    "[^"]* - one quote, and anything non-quote
    (", "[^"]*)* - zero or more patterns of ", "...
    "$ - expect " at the end
.*"$ - whole pattern must end in "
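To report the offending lines rather than the valid ones, one option is to apply the pattern to each line and keep those that fail. The following is a minimal Python sketch, not part of the original answer; the helper name `find_bad_lines` and the inline `sample` list are assumptions made for illustration, with the sample copied from the question's t123.txt.

```python
import re

# Pattern from the answer: every field is double-quoted, fields are separated
# by a comma plus one space, and no quote character appears inside a field.
VALID_LINE = re.compile(r'^(?="[^"]*(", "[^"]*)*"$).*"$')


def find_bad_lines(lines):
    """Return (line_number, text) pairs for lines that fail the pattern.

    Hypothetical helper for illustration; line numbers start at 1.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")  # tolerate lines read straight from a file
        if not VALID_LINE.match(text):
            bad.append((number, text))
    return bad


# Sample data copied from the question's t123.txt.
sample = [
    '"first", "second", "and last',
    '"second", "line", "ok"',
    '"third", "line", "not, "ok"',
]

for number, text in find_bad_lines(sample):
    print(f"line {number}: {text}")
```

With the sample from the question this reports lines 1 and 3. To check a real file, the same function can be given an open file object, for example `find_bad_lines(open("t123.txt"))`.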
Notes on this regex:
- it does not accept escaped quotes, such as "this is a ""quote"" in a cell"
- it does not accept unquoted values, such as 99 in "foo",99,"bar", which is valid in CSV
Upvotes: 1