fionpo

Reputation: 141

Regex Remove non matching line while substituting

Given this text string of ; delimited columns:

a;b;c
a;x;23
b;b;12

I want to get column 3 of every line that has an a in column 1, using ^(a);(.*?);(.*?)$ in a substitution, as shown here.

However, as you can see, the full non-matching line is also present in the result after the substitution.
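
The behaviour can be reproduced outside the demo. This is a minimal Python sketch, assuming the built-in re module, the MULTILINE flag, and \3 as the replacement (the demo's exact settings are not shown in the post, so these are assumptions):

```python
import re

text = "a;b;c\na;x;23\nb;b;12"

# Replace every line that starts with "a;" by its third column.
# Lines that do not match the pattern are left untouched by re.sub.
result = re.sub(r"^(a);(.*?);(.*?)$", r"\3", text, flags=re.MULTILINE)

print(result)
```

The last line, b;b;12, never matches, so the substitution has nothing to replace there and the line passes through unchanged.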

Any idea how to get only the 3rd column of the matching lines, without the non-matching one?

Thanks

Upvotes: 2

Views: 144

Answers (2)

The fourth bird

Reputation: 163207

Instead of substituting, you could get only the match: start the match with a, match the second column, and then use \K to forget what has been matched so far.

Then match the third column. The values for column 2 and column 3 can be matched using a negated character class.

^a;[^;\r\n]+;\K[^;\r\n]+$
  • ^ Start of string (or of each line when the multiline flag is set)
  • a; Match literally
  • [^;\r\n]+; Column 2, match any char except ; or a newline
  • \K Reset the match buffer
  • [^;\r\n]+ Column 3, match any char except ; or a newline
  • $ End of string (or of each line when the multiline flag is set)
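
Note that \K is not available in every regex flavor; Python's built-in re module, for example, does not support it. As a rough equivalent, here is a sketch that swaps \K for a capturing group around column 3 and collects the captures with re.findall. The sample text comes from the question; the variable names are only illustrative:

```python
import re

text = "a;b;c\na;x;23\nb;b;12"

# Same structure as the pattern above, but with a capturing group around
# column 3 instead of \K, since Python's re module has no \K.
pattern = re.compile(r"^a;[^;\r\n]+;([^;\r\n]+)$", re.MULTILINE)

# findall returns only the captured group for each matching line,
# so non-matching lines never appear in the result.
third_columns = pattern.findall(text)

print(third_columns)
```

Because this extracts matches rather than substituting, lines such as b;b;12 are simply skipped.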

Regex demo

Upvotes: 1

Wiktor Stribiżew

Reputation: 626699

You may add a .* alternative to just match the line if the first alternative does not match:

^(?:(a);(.*?);(.*?)|.*)$
 ^^^               ^^^

See the regex demo
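
As an illustration of how this behaves in a substitution, here is a small Python sketch. It assumes the replacement is \3 and relies on Python 3.5+ replacing an unmatched group with an empty string; other engines may treat unmatched groups differently:

```python
import re

text = "a;b;c\na;x;23\nb;b;12"

# First alternative captures the three columns; the .* alternative consumes
# any other line without setting group 3.
pattern = re.compile(r"^(?:(a);(.*?);(.*?)|.*)$", re.MULTILINE)

# For lines matched by .*, group 3 is unset and is replaced by "".
result = pattern.sub(r"\3", text)

# The line break after a consumed line is still there, so empty lines
# are dropped in a separate step.
columns = [line for line in result.splitlines() if line]

print(columns)
```

The non-matching line is reduced to an empty line rather than removed outright, which is why the sketch filters empty lines afterwards.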

NOTE: If there is a requirement to only match two semi-colons in the string, you need to use [^;]* instead of .*?:

^(?:(a);([^;]*);([^;]*)|.*)$

See this regex demo (\n added to the negated character class in the demo to account for the fact that the regex test is performed on a single multiline string, not a set of separate strings).
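
The difference between the two variants shows up on a line with more than two semi-colons. A hedged Python sketch, using a made-up input line a;x;23;extra that is not part of the question, and \n added to the negated character class as in the demo:

```python
import re

# Second line has an extra column (hypothetical input for illustration).
text = "a;b;c\na;x;23;extra\nb;b;12"

lazy = re.compile(r"^(?:(a);(.*?);(.*?)|.*)$", re.MULTILINE)
strict = re.compile(r"^(?:(a);([^;\n]*);([^;\n]*)|.*)$", re.MULTILINE)

# With .*? the third group stretches across the extra semi-colon.
lazy_result = lazy.sub(r"\3", text)

# With [^;\n]* the first alternative fails on that line, so the .*
# alternative consumes it and group 3 stays empty.
strict_result = strict.sub(r"\3", text)

print(repr(lazy_result))
print(repr(strict_result))
```

With the lazy pattern the second line yields 23;extra, while the negated character class treats it as a non-matching line.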

Upvotes: 1
