Reputation: 37
I want to use the sed command to delete some specific strings.
This is the file (tRNA.fa):
>tRNA-Ala-AGC-1-1 (chrII.trna5-AlaAGC) chrII:4565386-4565457 (+) Ala (AGC) 72 bp Sc: 72.4
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA
>tRNA-Ala-AGC-1-10 (chrX.trna261-AlaAGC) chrX:7378738-7378809 (-) Ala (AGC) 72 bp Sc: 72.4
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA
>tRNA-Ala-AGC-1-11 (chrX.trna260-AlaAGC) chrX:7507245-7507316 (-) Ala (AGC) 72 bp Sc: 72.4
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA
I just want to keep ">tRNA-XXX-XXX-X-X" and the next line.
So I tried to remove the unnecessary information with this sed command:
sed -i 's/\(.*\).*[0-9]$//g' tRNA.fa
However, this emptied every line starting with '>'.
The result I hope to get is:
>tRNA-Ala-AGC-1-1
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA
>tRNA-Ala-AGC-1-10
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA
>tRNA-Ala-AGC-1-11
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA
If you know how to do this replacement, please tell me. Thank you.
Upvotes: 0
Views: 46
Reputation: 140880
If you want to match a literal ( don't escape it:
sed -i 's/(.*).*[0-9]$//g' tRNA.fa
But really, the following is enough to remove everything from the first ( to the end of the line:
sed -i 's/(.*//' tRNA.fa
Note that you may also want s/ (.*// to remove the space before the ( too.
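A quick way to check that substitution before touching the real file is to run it without -i on a few lines. The sketch below assumes a POSIX shell and uses one record copied from the question; the variable names sample and cleaned are only for illustration.

```shell
# One record (header plus two sequence lines) copied from the question's tRNA.fa.
sample='>tRNA-Ala-AGC-1-1 (chrII.trna5-AlaAGC) chrII:4565386-4565457 (+) Ala (AGC) 72 bp Sc: 72.4
GGGGGTATAGCTCAGTGGTAGAGCGCTCCCTTAGCATGGGAGAGGgCTGGGGTTCAATTC
CCCATACCTCCA'

# No -i here: the result goes to stdout so it can be inspected first.
# ' (.*' matches from the first " (" to the end of the line; sequence
# lines contain no "(" and pass through unchanged.
cleaned=$(printf '%s\n' "$sample" | sed 's/ (.*//')
printf '%s\n' "$cleaned"
```

Once the output looks right, the same expression can be run with -i on tRNA.fa.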
The \(...\) syntax groups an expression into a subexpression. It is most often used for back references, but it can also be used as, for example, \(abc\)*, which matches zero or more occurrences of the string "abc".
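To make the grouping behaviour concrete, here is a small sketch assuming a POSIX shell and sed's default basic regular expressions. The header line is copied from the question. The back-reference command is only an illustrative alternative, not the fix suggested above, and the variable names are hypothetical.

```shell
line='>tRNA-Ala-AGC-1-10 (chrX.trna261-AlaAGC) chrX:7378738-7378809 (-) Ala (AGC) 72 bp Sc: 72.4'

# \( \) is a group, not a literal parenthesis, so the question's pattern
# matches the whole header line and the substitution leaves it empty.
emptied=$(printf '%s\n' "$line" | sed 's/\(.*\).*[0-9]$//g')

# Group plus back reference: capture ">" and everything up to the first
# space, then replace the whole line with the captured text (\1).
name=$(printf '%s\n' "$line" | sed 's/^\(>[^ ]*\).*/\1/')

# \(abc\)* matches zero or more repetitions of "abc".
rest=$(printf '%s\n' 'abcabcX' | sed 's/^\(abc\)*//')

printf '[%s] [%s] [%s]\n' "$emptied" "$name" "$rest"
```

The first command reproduces the behaviour described in the question, and the second shows how a group and \1 can keep part of a line instead of deleting the rest.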
Here is a great sed introduction.
Upvotes: 1