Perlnika

Reputation: 5066

regexp character or end of line with egrep

I have the following regexp:

egrep '(chr1 .*n70$|chr1 .*n70-)' results/files/forbidden_variants

This matches

chr1 n70
chr1 n70-n79
chr1 n70-n79-n83
chr1 n70-n79
chr1 n70-n79-s15-s16
chr1 n70
chr1 n70-n91
chr1 n70

and it is terribly slow, because I run it millions of times with different ids in place of n70.

Therefore I wanted to get rid of the OR, so I wrote:

egrep '(chr1 .*n70[-\$])' results/files/forbidden_variants

but it does not work, because this command does not match the end of line. The output looks like this:

chr1 n70-n79
chr1 n70-n79-n83
chr1 n70-n79
chr1 n70-n79-s15-s16
chr1 n70-n91

What am I doing wrong here? :) Thank you.

Upvotes: 0

Views: 114

Answers (2)

blackSmith

Reputation: 3154

Just add a + to the current regex:

egrep '(chr1 n70[-\$]+)' results/files/forbidden_variants

Upvotes: 1

user1820801

Reputation: 114

Why don't you simply use

chr1 n70

or, if you need the OR, you can use

chr1 n70($|-)

which is basically equivalent to your first expression. Given your matches, I don't see the need for the .* in your first expression.
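
A minimal sketch to check this, assuming the input looks like the sample lines in the question. The `sample` variable and the extra `chr1 n700` and `chr2 n70` lines are made up for illustration. Inside a bracket expression, `$` is a literal dollar sign rather than an end-of-line anchor, which would explain why the bracket version only returns the lines where n70 is followed by `-`. Keeping the anchor in a group avoids that:

```shell
# Hypothetical test input modeled on the lines shown in the question.
sample='chr1 n70
chr1 n70-n79
chr1 n700
chr1 n70-n79-s15-s16
chr2 n70'

# Inside [...] the $ is a literal character, not an anchor,
# so only lines with "n70-" are matched here.
bracket_hits=$(printf '%s\n' "$sample" | grep -E 'chr1 .*n70[-$]')

# In a group, $ keeps its meaning as end of line,
# so both "n70-" and a trailing "n70" are matched.
group_hits=$(printf '%s\n' "$sample" | grep -E 'chr1 .*n70(-|$)')

echo "bracket expression:"
printf '%s\n' "$bracket_hits"
echo "group with anchor:"
printf '%s\n' "$group_hits"
```

With this input the bracket version should print the two `n70-` lines, while the grouped version should also print the bare `chr1 n70` line. Neither should match `chr1 n700` or `chr2 n70`.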

Upvotes: 0
