Reputation: 914
I would like to extract some rows from a large .txt file:
MYNAME, 2017-03-01, John Wayne, H\
MYNAME, 2017-01-01, Brian Wayne,P\
MYNAME, 2017-02-01, Brian Duffe, TR\
MYNAME, 2017-03-01, Iggor Miller, R\
From this file, I would like to extract only the rows for people whose surname starts with W:
MYNAME, 2017-03-01, John Wayne, H\
MYNAME, 2017-01-01, Brian Wayne,P\
What I have tried did not work as expected:
/(?:[^\,]*\,){2}([^,]*)/
Here I try to match a W after the second comma.
I would appreciate any suggestions!
Upvotes: 3
Views: 1210
Reputation: 627334
Your (?:[^\,]*\,){2}([^,]*) regex matches any 0+ chars other than a comma, followed by a comma, exactly 2 times, and then 0+ chars other than a comma. Just adding W won't work: you need to account for the words before the family names. You might add \s+\S+\s+W before the last [^,]*, or use a PCRE regex:
^(?:[^,]*,){2}\h*\S+\h+W.*
See this demo.
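As a rough illustration of the first suggestion (inserting \s+\S+\s+W before the last [^,]*), here is a minimal Python sketch. Python's re module does not support \h, which is why it uses the \s variant rather than the PCRE pattern. The sample rows are copied from the question with the trailing backslashes left out; the names ROWS, SURNAME_W and names_with_w_surname are hypothetical, chosen only for this example.

```python
import re

# Sample rows from the question (trailing backslashes omitted).
ROWS = [
    "MYNAME, 2017-03-01, John Wayne, H",
    "MYNAME, 2017-01-01, Brian Wayne,P",
    "MYNAME, 2017-02-01, Brian Duffe, TR",
    "MYNAME, 2017-03-01, Iggor Miller, R",
]

# Two comma-terminated fields, then "<first word> W..." captured as group 1.
SURNAME_W = re.compile(r"(?:[^,]*,){2}(\s+\S+\s+W[^,]*)")


def names_with_w_surname(rows):
    """Return the captured name field of every row whose second word starts with W."""
    names = []
    for row in rows:
        m = SURNAME_W.match(row)  # match() anchors at the start of the row
        if m:
            names.append(m.group(1).strip())
    return names


print(names_with_w_surname(ROWS))  # ['John Wayne', 'Brian Wayne']
```

The capture group keeps only the name field; to keep the whole row instead, test for a match and return the row itself.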
Details
^ - start of string/line
(?:[^,]*,){2} - 2 occurrences of any 0+ chars other than a comma, followed by a comma
\h* - 0+ horizontal whitespace chars
\S+ - 1 or more non-whitespace chars
\h+ - 1+ horizontal whitespace chars
W - a W char
.* - any 0+ chars other than line break chars, as many as possible
Another alternative (a JS compatible one): match all chars other than a comma after you have matched two chunks of non-comma chars followed by a comma, and then match a whitespace + W:
^(?:[^,]*,){2}[^,]*\sW.*
See this demo.
Here, [^,]*\sW.* matches any 0+ chars other than a comma, as many as possible, then a whitespace, then W, and then any 0+ chars other than line break chars, as many as possible (the rest of the string/line).
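The JS-compatible pattern uses only syntax that Python's re module also accepts, so a line filter can be sketched as below. This is an assumption-laden illustration, not part of the original answer: filter_rows and SAMPLE are hypothetical names, and the rows are the question's sample data without the trailing backslashes.

```python
import re

# The JS-compatible pattern from above, compiled unchanged.
LINE_W = re.compile(r"^(?:[^,]*,){2}[^,]*\sW.*")

# Sample rows from the question (trailing backslashes omitted).
SAMPLE = [
    "MYNAME, 2017-03-01, John Wayne, H",
    "MYNAME, 2017-01-01, Brian Wayne,P",
    "MYNAME, 2017-02-01, Brian Duffe, TR",
    "MYNAME, 2017-03-01, Iggor Miller, R",
]


def filter_rows(lines):
    """Keep the lines whose third field contains a whitespace followed by W."""
    return [line for line in lines if LINE_W.match(line)]


for row in filter_rows(SAMPLE):
    print(row)
```

Matching one line at a time keeps [^,]* from running across line breaks, since a negated class also matches newlines. One caveat: because the third field starts with a space, this pattern also accepts a first name that starts with W (for example " William Smith"), whereas the \h*\S+\h+W version requires a word before the W.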
Upvotes: 1