Reputation: 141
Given this text string of ;-delimited columns:
a;; z
z;d;hh
d;23
;;io;
b;b;12
a;b;bb;;;34
This regex
^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?|.*)$
with this substitution $3
will return the 3rd column, if it exists, from lines whose first column is a, b or z, as shown in this demo.
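For readers without the demo at hand, here is a minimal sketch of the same substitution in Python. It assumes Python's re module behaves like the demo's engine for this pattern, with re.MULTILINE standing in for the m modifier and \3 standing in for $3. The variable names are only illustrative.

```python
import re

# Sample input copied from the text above.
text = "\n".join([
    "a;; z",
    "z;d;hh",
    "d;23",
    ";;io;",
    "b;b;12",
    "a;b;bb;;;34",
])

# The pattern from the question; re.MULTILINE makes ^ and $ work per line.
original = re.compile(
    r"^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?|.*)$",
    re.MULTILINE,
)

# \3 is Python's spelling of $3. A group that did not take part in the
# match is replaced with an empty string (Python 3.5+).
result = original.sub(r"\3", text)
print(result.split("\n"))
```

The lines that do not start with a, b or z are emptied, but their line breaks stay in place, which is where the unwanted empty lines come from.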
My question is, how to return only the non-empty lines, as in:
z
hh
12
bb
Thanks for any help
Upvotes: 1
Views: 61
Reputation: 626709
You may still do that with a plain regex: you just need to re-arrange your two alternatives in the regex pattern and add an optional line break pattern at the end.
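Your pattern has the ^((O)(N)(E)|.*)$ structure, so the second alternative matches the whole line if the first one does not match, but both alternatives stop at the line end (you are using the multiline flag, so $ matches at the position before each line break char and at the end of the string). The line break itself is never consumed, so a line matched by the second alternative is replaced with an empty string but still leaves an empty line behind. So, you need to convert it to ^(?:(O)(N)(E)$|.*$\R?), where the second alternative also consumes the line break: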
^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?$|.*$\R?)
                                                       ^^^^^^^^^
See the regex demo in the regex101 tester, and note the use of the g and m modifiers.
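The same rearrangement can be tried outside regex101. The sketch below uses Python, whose re module does not support \R, so it assumes (?:\r?\n)? as a stand-in that covers LF and CRLF line endings only. re.sub replaces every match, which plays the role of the g modifier, and re.MULTILINE plays the role of m.

```python
import re

# Sample input from the question, joined with LF line breaks.
text = "a;; z\nz;d;hh\nd;23\n;;io;\nb;b;12\na;b;bb;;;34"

# Assumption: (?:\r?\n)? replaces \R?, which Python's re does not support.
fixed = re.compile(
    r"^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?$"  # wanted lines
    r"|.*$(?:\r?\n)?)",  # any other line, together with its line break
    re.MULTILINE,
)

# Wanted lines are replaced by their third column; all other lines are
# replaced by nothing, line break included.
result = fixed.sub(r"\3", text)
print(result.split("\n"))
```

With the sample data as written, the first result keeps the leading space of the third column of the first line, since the pattern captures the column verbatim.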
So, in general, the pattern is
^ - start of a line
(?: - start of a non-capturing group (so that ^ could be applied to both alternatives):
(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?$ - your specific pattern capturing necessary substrings, up to the end of line/string
| - or
.*$ - any 0+ chars other than line break chars, as many as possible, up to the line/string end ($), and then
\R? - an optional line break sequence
) - end of the group.
Upvotes: 1