fionpo

Reputation: 141

Regex: return only non-empty lines after substitution

Given this text string of ;-delimited columns:

a;; z
z;d;hh 
d;23
;;io;
b;b;12

a;b;bb;;;34

This regex

^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?|.*)$

with the substitution $3 returns the 3rd column, if it exists, from lines whose first column is a, b or z, as shown in this demo. Every other line is replaced with an empty string, so it stays in the output as a blank line.

My question is, how to return only the non-empty lines, as in:

    z
hh 
12
bb

Thanks for any help

Upvotes: 1

Views: 61

Answers (1)

Wiktor Stribiżew

Reputation: 626709

You may still do that with a plain regex: you just need to move the $ anchor into each of the two alternatives in the regex pattern and add an optional line break pattern at the end of the second one.

Your pattern has the ^(?:(O)(N)(E)|.*)$ structure, so the second alternative matches the whole line if the first one does not match. However, both alternatives stop at the line end (you are using the multiline flag, so $ matches at any position before a line break char or at the end of string), which means the line break after a non-matching line is never consumed and an empty line is left behind. So, you need to convert it to ^(?:(O)(N)(E)$|.*$\R?):

^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?$|.*$\R?)
                                                       ^^^^^^^^^

See the regex demo. In the regex101 tester, note the use of the g and m modifiers.

So, in general, the pattern is

  • ^ - start of a line
  • (?: - start of a non-capturing group (so that ^ could be applied to both alternatives):
    • (a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?$ - your specific pattern capturing necessary substrings, up to the end of line/string
    • | - or
    • .*$ - any 0+ chars other than line break chars, as many as possible, up to the line/string end ($), and then
    • \R? - an optional line break sequence
  • ) - end of the group.
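
For anyone who wants to try this outside regex101, here is a minimal Python sketch comparing the original and the rearranged pattern on the sample input. Two things in it are assumptions and not part of the answer above. First, Python's built-in re module does not support \R, so (?:\r?\n)? stands in for \R?. Second, the input is assumed to use "\n" line endings. The names TEXT, ORIGINAL, FIXED and third_column are only illustrative.

```python
import re

# Sample input from the question ("\n" line endings assumed).
TEXT = "a;; z\nz;d;hh \nd;23\n;;io;\nb;b;12\n\na;b;bb;;;34"

# Original pattern: the catch-all alternative stops at the line end, so the
# line break after a non-matching line is never consumed.
ORIGINAL = re.compile(
    r"^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?|.*)$",
    re.MULTILINE,
)

# Rearranged pattern. Python's re module has no \R, so (?:\r?\n)? stands in
# for \R? here (an adaptation for this sketch, not part of the answer).
FIXED = re.compile(
    r"^(?:(a|b|z)(?:;|$)([^;\r\n]*)(?:;|$)([^;\r\n]*)(?:;.*)?$|.*$(?:\r?\n)?)",
    re.MULTILINE,
)


def third_column(pattern: re.Pattern, text: str) -> str:
    """Replace every match with group 3 (the equivalent of the $3 substitution)."""
    # Python 3.5+ substitutes an empty string for a group that did not participate.
    return pattern.sub(r"\3", text)


if __name__ == "__main__":
    print(repr(third_column(ORIGINAL, TEXT)))  # ' z\nhh \n\n\n12\n\nbb'
    print(repr(third_column(FIXED, TEXT)))     # ' z\nhh \n12\nbb'
```

Note that a line whose first column matches but whose third column is empty or missing (for example "a;b") is still handled by the first alternative, which does not consume the line break, so such a line would still leave a blank line in the result.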

Upvotes: 1
