Reputation: 134
I have a generated CSV that doesn't quote the free-text comments in one column, and those comments can contain any character, including new lines.
"Regular expression for csv with commas and no quotes" is a similar question, but that asker doesn't have line breaks or additional columns to parse through.
A line of text in the csv can look like this:
1, 15231, 123123, 1231, word word word, YYYY-MM-DD HH:mm:ss.sss, 13453, **This would be the section with any character for users to communicate and the db stores and
new lines to record communication**, YYYY-MM-DD HH:mm:ss.sss, User name, 12412413, 01231231, 123,12,,*ASTERIX USED*, YYYY-MM-DD HH:mm:ss.sss
Then another new line follows, and another record like the one above needs to be parsed.
So far I've tried this
/(\d+?),(\d+?),(\d+?),(\d+?),(.+?),(.+?),(.+?),(.+?),(.+?),(.+?),(.+?),(.+?),(.+?),(.+?),(.+(?=,\d{4})),
But I can't seem to get past the cases where there are date references in the comments section of the CSV.
I'm fairly new to regex, and (?=) is new to me, as I had to go beyond simple regex patterns.
Upvotes: 0
Views: 124
Reputation: 208665
If you know the exact number of fields that there should be, then you can use the following method: match every field that is not user-entered with `[^,]*`, and match the user-entered field with `.*`.
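The same idea extends to any field count: one comma-free pattern per ordinary field and a single greedy pattern for the free-text field. The sketch below is only an illustration in Python. `build_pattern` is a hypothetical helper, the count of 17 fields with the comment at 0-based index 7 is read off the sample record in the question, and the dates are made-up placeholders.

```python
import re

def build_pattern(total_fields, free_index, flags=0):
    """Hypothetical helper: free_index is the 0-based position of the free-text field."""
    parts = ["(.*)" if i == free_index else "([^,]*)" for i in range(total_fields)]
    return re.compile(",".join(parts), flags)

# Assumed layout from the question's sample: 17 fields, comment at index 7.
pattern = build_pattern(17, 7)

# Placeholder record; the comment contains a comma and a date-like string.
record = (
    "1, 15231, 123123, 1231, word word word, 2014-03-01 10:15:00.000, 13453, "
    "called user on 2014-02-27, no answer, 2014-03-01 10:20:00.000, User name, "
    "12412413, 01231231, 123,12,,*ASTERIX USED*, 2014-03-02 08:00:00.000"
)

match = pattern.fullmatch(record)
# Strip the space that follows most commas in the sample data.
fields = [f.strip() for f in match.groups()]
```

Because the fields before and after the comment cannot contain commas, the greedy group is left with everything in between, so dates inside the comment need no special handling. `fullmatch` is used so the pattern has to account for the whole record.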
For example, if you have five fields in total and the third is entered by the user, you would use the following regex:
([^,]*),([^,]*),(.*),([^,]*),([^,]*)
Example: http://www.rubular.com/r/E6785bWW0R
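As a quick check of the five-field regex, here is a minimal Python sketch. The sample line is invented for illustration; only the pattern comes from the answer.

```python
import re

# Five fields; only the third (user-entered) field may contain commas.
pattern = re.compile(r"([^,]*),([^,]*),(.*),([^,]*),([^,]*)")

# Invented sample line: the third field has two embedded commas.
line = "1,abc,free text, with, commas,2014-01-01,42"

match = pattern.fullmatch(line)
fields = match.groups()
third = fields[2]  # "free text, with, commas"
```

The greedy `.*` first grabs as much as it can, then backtracks just far enough for the last two comma-free fields to match.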
If the user-entered field may contain line breaks, make sure you enable the option so that `.` matches line break characters (often `s`, or a constant like `DOTALL`; in some languages you can prefix your regex with `(?s)`). Alternatively, just replace `.*` with `[\s\S]*`, which will match anything regardless of options used.
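To make the options concrete, this sketch compares them in Python, where the constant is `re.DOTALL`. The record is an invented example with a line break inside the third field.

```python
import re

# Invented record: the user-entered third field spans two lines.
record = "1,abc,first line\nsecond line, with a comma,2014-01-01,42"

base = r"([^,]*),([^,]*),(.*),([^,]*),([^,]*)"

plain = re.compile(base)                  # "." stops at line breaks
dotall = re.compile(base, re.DOTALL)      # flag passed as a constant
inline = re.compile("(?s)" + base)        # flag embedded in the pattern
portable = re.compile(r"([^,]*),([^,]*),([\s\S]*),([^,]*),([^,]*)")

no_match = plain.fullmatch(record)        # None: "." cannot cross the "\n"
comment = dotall.fullmatch(record).group(3)
```

The `inline` and `portable` patterns capture the same third field as `dotall`. Note that `[^,]*` already matches line breaks on its own, so only the `.*` group needs the extra option.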
Upvotes: 1