david

Reputation: 825

Sed keep digit and discard extension

I have a file whose header line contains the following pattern (tab-separated):

1.mapped.bam 2.mapped.bam 3.mapped.bam ....

I would like

SAMPLE_1 SAMPLE_2 SAMPLE_3 .....

I have tried:

sed -r 's/([0-9])(.mapped.bam)/SAMPLE_\1/g'

but got

1SAMPLE_1 2SAMPLE_2 3SAMPLE_3 ???

Upvotes: 3

Views: 25

Answers (1)

Wiktor Stribiżew

Reputation: 627517

You may use the following POSIX ERE solution:

sed -E 's/([0-9]+)\.mapped\.bam/SAMPLE_\1/g'

An equivalent POSIX BRE solution is:

sed 's/\([0-9][0-9]*\)\.mapped\.bam/SAMPLE_\1/g'

See the online sed demo
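
The two commands can also be checked locally. This is a minimal sketch, assuming a hypothetical tab-separated header. The variable names `header`, `ere_out` and `bre_out` are illustrative only, and `10.mapped.bam` is added to exercise a multi-digit number:

```shell
# Hypothetical tab-separated header line, modeled on the question.
header="$(printf '1.mapped.bam\t2.mapped.bam\t10.mapped.bam')"

# POSIX ERE variant (-E): plain parentheses capture, + means "one or more".
ere_out="$(printf '%s\n' "$header" | sed -E 's/([0-9]+)\.mapped\.bam/SAMPLE_\1/g')"

# POSIX BRE variant: escaped parentheses capture, [0-9][0-9]* stands in for +.
bre_out="$(printf '%s\n' "$header" | sed 's/\([0-9][0-9]*\)\.mapped\.bam/SAMPLE_\1/g')"

# Both variants should print the same tab-separated SAMPLE_ names.
printf '%s\n' "$ere_out"
printf '%s\n' "$bre_out"
```

Both variables should end up holding the same line, with each number kept and the `.mapped.bam` suffix dropped.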

Here,

  • ([0-9]+) - Group 1 (referenced as \1 in the RHS, i.e. the replacement pattern): one or more digits
  • \.mapped\.bam - a literal .mapped.bam substring.

Note that in both POSIX BRE and ERE, dots outside of bracket expressions must be escaped to match literal dots, and capturing parentheses must be escaped in the POSIX BRE flavor.
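
To illustrate why the dots need escaping, here is a small sketch using a made-up input in which the second token only resembles the real suffix. The input and the variable names `line`, `loose` and `strict` are assumptions for illustration, not data from the question:

```shell
# Hypothetical input: the second token is a look-alike without literal dots.
line="$(printf '1.mapped.bam\t2xmappedybam')"

# Unescaped dots match any single character, so both tokens are rewritten.
loose="$(printf '%s\n' "$line" | sed -E 's/([0-9]+).mapped.bam/SAMPLE_\1/g')"

# Escaped dots match only a literal ".", so the look-alike is left untouched.
strict="$(printf '%s\n' "$line" | sed -E 's/([0-9]+)\.mapped\.bam/SAMPLE_\1/g')"

printf 'loose:  %s\n' "$loose"
printf 'strict: %s\n' "$strict"
```

With the unescaped pattern, `2xmappedybam` is rewritten as well. The escaped pattern changes only `1.mapped.bam`.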

Upvotes: 1
