fionpo

Reputation: 141

Regex matches a line that should not match

Given this semicolon-delimited string:


hap;; z
z ;d;hh 
 z;d;hh ;gfg;fdf ;ppp
ap;jj
lo mo;z
d;23
;;io;
b yio;b;12
 b 
a;b;bb;;;34

I want to capture columns $1, $2 and $3 from any line whose column 1 contains ap, b, or o m.

Using this regex

^(?:(.*?(?:ap|b|o m).*?)(?:;([^\r\n;]*))?(?:;([^\r\n;]*))?(?:;.*)?|.*)$

As this demo shows, line 11 matches, but it should not.

As far as I understand, I cannot use a negated character class to match the parts of column 1 before and after the keyword.

How can I make line 11 not match?
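
To make the symptom concrete, here is a minimal Python sketch that runs the pattern above unchanged against single lines. The helper name `captured_columns` is hypothetical, and Python's `re` module is assumed to behave like the demo's regex flavor for this pattern. In the sample data, a;b;bb;;;34 is the only line where a keyword appears solely outside column 1, so it is presumably the line the question calls line 11.

```python
import re

# The pattern from the question, unchanged.
PATTERN = re.compile(
    r"^(?:(.*?(?:ap|b|o m).*?)(?:;([^\r\n;]*))?(?:;([^\r\n;]*))?(?:;.*)?|.*)$"
)

def captured_columns(line):
    """Return the three capture groups for one line (None where a group did not participate)."""
    # The trailing "|.*" alternative means every single line matches,
    # so match() never returns None here.
    return PATTERN.match(line).groups()

# "." also matches ";", so the lazy ".*?" walks past the first delimiter
# until it reaches the "b" in column 2.
print(captured_columns("a;b;bb;;;34"))
print(captured_columns("b yio;b;12"))
```

For a;b;bb;;;34 the first group comes back as a;b, which shows that group 1 is not confined to column 1.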

Upvotes: 0

Views: 73

Answers (2)

Emmanuel Di Pretoro

Reputation: 79

Here is a regex that matches your data:

^([^;\n]*(?:ap|b|o m)[^;]*);((?(1)[^;]*));?((?(1)[^;]*))$

You can see it in action.
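
For comparison, here is a hedged Python sketch of a different pattern, not the one in this answer. It keeps the structure of the question's regex but replaces `.` around the keywords with the negated class `[^;\r\n]`, so group 1 cannot extend past the first delimiter. It also drops the trailing `|.*` alternative, so non-qualifying lines simply fail to match. The names `COLUMN1_PATTERN` and `first_three_columns` are illustrative assumptions.

```python
import re

# Hypothetical variant of the question's pattern: "." is replaced by
# [^;\r\n] around the keywords so group 1 stays inside column 1.
COLUMN1_PATTERN = re.compile(
    r"^([^;\r\n]*(?:ap|b|o m)[^;\r\n]*)"  # column 1, must contain a keyword
    r"(?:;([^;\r\n]*))?"                  # optional column 2
    r"(?:;([^;\r\n]*))?"                  # optional column 3
    r"(?:;.*)?$"                          # any remaining columns, ignored
)

def first_three_columns(line):
    """Return (col1, col2, col3) if column 1 contains a keyword, else None."""
    m = COLUMN1_PATTERN.match(line)
    return m.groups() if m else None

for line in ["hap;; z", "ap;jj", " b ", "a;b;bb;;;34", ";;io;"]:
    print(repr(line), "->", first_three_columns(line))
```

Columns missing from a short line come back as None rather than as empty strings, so callers may want to substitute "" before joining.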

Upvotes: 1

anubhava

Reputation: 785581

You may consider this perl one-liner that works like awk:

perl -F';' -MEnglish -lane 'BEGIN {$OFS=";"} print $F[0],$F[1],$F[2] if $F[0] =~ /ap|b|o m/' file

An awk solution would be even simpler:

awk 'BEGIN {FS=OFS=";"} $1 ~ /ap|b|o m/{print $1,$2,$3}' file

Output:

hap;; z
ap;jj;
lo mo;z;
b yio;b;12
 b ;;
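
The same split-then-filter idea can be sketched in Python, for readers who want it outside a shell one-liner. This is an illustrative equivalent under stated assumptions, not part of the original answer: the function name `select_columns` is hypothetical, and short lines are padded with empty fields to mimic how awk prints empty $2 and $3.

```python
import re

KEYWORDS = re.compile(r"ap|b|o m")

def select_columns(lines):
    """Yield 'col1;col2;col3' for lines whose first column contains a keyword."""
    for line in lines:
        fields = line.rstrip("\r\n").split(";")
        # Only column 1 is tested, so a keyword in a later column is ignored.
        if KEYWORDS.search(fields[0]):
            # Pad so short lines still yield three columns, like awk's empty $2/$3.
            fields += ["", ""]
            yield ";".join(fields[:3])

sample = [
    "hap;; z",
    "z ;d;hh ",
    " z;d;hh ;gfg;fdf ;ppp",
    "ap;jj",
    "lo mo;z",
    "d;23",
    ";;io;",
    "b yio;b;12",
    " b ",
    "a;b;bb;;;34",
]

for row in select_columns(sample):
    print(row)
```

Splitting first and testing only the first field avoids having to express "stay inside column 1" within the regex itself.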

Upvotes: 2
