Perlnika

Reputation: 5066

sed too greedy (+ vs *)

I have lines like this:

scaffold157|size21652:7243-9055/0_1813 10 -2127 86.5772 0 272 854 1813 1 185842 186425 147764049 254

I need to remove the part from "/" up to the next word boundary (the first tab), so in my example this part:

/0_1813

with this result:

scaffold157|size21652:7243-9055 10 -2127 86.5772 0 272 854 1813 1 185842 186425 147764049 254

However, my sed seems to be too greedy with

 sed 's/\/0_.*\b//'

eating all the columns. With .+ instead, the command doesn't work at all and nothing is replaced. What am I doing wrong? Why is .+ not working?

Upvotes: 1

Views: 140

Answers (3)

Ed Morton

Reputation: 204648

The reason .+ behaves the way you are seeing is that + is a metacharacter only in EREs, and sed uses BREs by default. Unless you enable EREs by adding -r (or -E, which both GNU and BSD sed accept) or escape it as \+ (a GNU extension), sed treats + as a literal plus character.
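To see the difference, here is a minimal sketch that runs the same expression in both modes. The shortened sample line and the variable names are made up for illustration, and it assumes a sed that accepts -E (GNU and BSD sed both do).

```shell
# Shortened, hypothetical sample line; it contains no literal "+".
line='x/0_1813 10'

# BRE (default): "+" is a literal plus, so "/0_.+" needs "/0_", any one
# character, then a real "+". Nothing matches and the line is unchanged.
bre=$(printf '%s\n' "$line" | sed 's/\/0_.+//')

# ERE (-E): "+" means "one or more", so ".+" greedily takes the rest of the line.
ere=$(printf '%s\n' "$line" | sed -E 's/\/0_.+//')

printf 'BRE: %s\nERE: %s\n' "$bre" "$ere"
```

In BRE mode the line comes back untouched, which is the "nothing is replaced" symptom from the question; in ERE mode everything after the slash is removed.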

That's an aside, though. All you need is:

$ sed 's|/[^[:space:]]*||' file
scaffold157|size21652:7243-9055 10 -2127 86.5772 0 272 854 1813 1 185842 186425 147764049 254

You can probably replace [[:space:]] with \s and [^[:space:]] with \S in some seds, e.g. GNU.
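As a rough check of that equivalence, the sketch below runs both spellings on a tab-separated sample. The input is assumed to be tab-delimited as the question describes, the variable names are illustrative, and the \S form assumes GNU sed.

```shell
# Hypothetical tab-separated sample built with printf.
line=$(printf 'scaffold157|size21652:7243-9055/0_1813\t10\t-2127')

# POSIX character class: should work in any sed.
posix=$(printf '%s\n' "$line" | sed 's|/[^[:space:]]*||')

# GNU shorthand: \S stands for [^[:space:]] (GNU extension).
gnu=$(printf '%s\n' "$line" | sed 's|/\S*||')

printf '%s\n%s\n' "$posix" "$gnu"
```

Both commands delete only the "/0_1813" piece and leave the tab that follows it in place, so the columns stay separated.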

Upvotes: 2

Kent

Reputation: 195269

I need to remove part from "/" until word boundary (first tab)

This one-liner gives your expected output:

sed -r 's#/\S*\b##'

Upvotes: 1

konsolebox

Reputation: 75618

Match digits instead:

sed 's/\/0_[0-9]*//'

Or negated spaces:

sed 's/\/0_[^ \t]*//'
sed 's/\/0_[^[:blank:]]*//'
sed -r 's/\/0_\S*\b//'

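With the negated-space forms, \b is no longer necessary, because the match already stops at the first whitespace character. Note that \t inside a bracket expression is understood by GNU sed but not by every sed; [^[:blank:]] is the portable spelling.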
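A small sketch of why the original attempt overshoots while the negated class does not. The shortened sample line and variable names are hypothetical, and \b assumes GNU sed.

```shell
# Hypothetical tab-separated sample line.
line=$(printf 'id:7243-9055/0_1813\t10\t254')

# Greedy: ".*" runs to the end of the line, and \b is satisfied by the LAST
# word boundary, so every remaining column is removed.
greedy=$(printf '%s\n' "$line" | sed 's/\/0_.*\b//')

# Negated class: the match cannot cross the first blank, so it stops there.
bounded=$(printf '%s\n' "$line" | sed 's/\/0_[^[:blank:]]*//')

printf 'greedy:  %s\nbounded: %s\n' "$greedy" "$bounded"
```

The greedy version keeps only the text before the slash, while the bounded version keeps every column after it.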

Upvotes: 1
