Perlnika

Reputation: 5066

Cannot match multiple occurrences of a character in sed regexp

I am trying to remove all the As at the end of a line.

alice$ cat pokusni 
SALALAA
alice$ sed -n 's/\(.*\)A$/\1/p' pokusni 
SALALA

One A is removed just fine.

alice$ sed -n 's/\(.*\)A+$/\1/p' pokusni 
alice$ sed -n 's/\(.*\)AA*$/\1/p' pokusni
SALALA

But multiple occurrences are not removed :(

I am probably making some very stupid mistake. Any help? Thanks.

Upvotes: 0

Views: 1465

Answers (4)

Vijay

Reputation: 67221

You can use perl:

> echo "SALALAA" | perl -lne 'if(/(.*?)[A]+$/){print $1}else{print}'
SALAL

Upvotes: 1

Jotne

Reputation: 41456

Using awk

awk '{sub(/AA$/,"A")}1' pokusni 
SALALA

EDIT: Corrected version, removing all As from the end of the line (x is an uninitialized variable, so it acts as the empty string).

awk '{sub(/A*$/,x)}1' pokusni 

Upvotes: 1

potong

Reputation: 58401

This might work for you:

sed -n 's/AA*$//p' file

This replaces an A followed by zero or more As at the end of the line with nothing.

N.B.

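sed -n 's/A*$//p' file

would produce the correct string. However, A*$ also matches the empty string at the end of a line, so the substitution succeeds on every line and the p flag prints lines that had no trailing A at all (false positives).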
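The difference can be seen with a small sketch. The two-line sample input is made up for illustration (the second line has no trailing A), and the function names are only for this example.

```shell
# Hypothetical sample input: one line ending in As, one line without any.
sample() { printf '%s\n' 'SALALAA' 'BOB'; }

# AA*$ needs at least one A, so only the first line is changed and printed.
strip_strict() { sample | sed -n 's/AA*$//p'; }

# A*$ also matches the empty string at the end of a line, so the
# substitution succeeds on every line and every line is printed.
strip_loose() { sample | sed -n 's/A*$//p'; }

strip_strict   # prints SALAL
strip_loose    # prints SALAL, then BOB
```

If unchanged lines should be printed as well, dropping both -n and the p flag makes either pattern behave the same.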

Upvotes: 2

Naruil

Reputation: 2310

Try this one: 's/\(.*[^A]\)AA*$/\1/p'


Why + does not work:

Because sed uses basic regular expressions by default, and there + is just an ordinary character that matches a literal plus sign.
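A short sketch of that behaviour, assuming a sed that accepts the -E option for extended regular expressions (GNU and BSD sed do). The input SALALA+ is made up to show the literal match.

```shell
# In a basic regular expression, A+$ means "an A, then a literal +, at end of line".
literal_plus() { echo 'SALALA+' | sed -n 's/\(.*\)A+$/\1/p'; }

# Assumption: this sed supports -E, where + means "one or more".
ere_plus() { echo 'SALALAA' | sed -E -n 's/A+$//p'; }

literal_plus   # prints SALAL (the trailing "A+" was removed)
ere_plus       # prints SALAL (both trailing As were removed)
```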

Why 's/\(.*\)AA*$/\1/p' does not work:

Because the regex engine is greedy: .* consumes as many characters as it can, including every A except the final one that the literal A in AA* requires. That leaves nothing for A*, which matches the empty string.
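To see the greedy match directly, here is a sketch that adds a second capture group (not in the original command) around the trailing A* and prints both groups in brackets.

```shell
# \1 is what the greedy .* took, \2 is what was left for the trailing A*.
show_groups() { echo 'SALALAA' | sed -n 's/\(.*\)A\(A*\)$/[\1][\2]/p'; }

# Forcing the first group to end in a non-A leaves all trailing As to AA*.
fixed() { echo 'SALALAA' | sed -n 's/\(.*[^A]\)AA*$/\1/p'; }

show_groups   # prints [SALALA][]
fixed         # prints SALAL
```

Note that the [^A] version needs at least one non-A character, so a line made up only of As is not matched and nothing is printed for it.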

Upvotes: 5
