Reputation: 19
I have a multi-FASTA file containing predicted proteins from 2 ab initio tools. Every sequence has an asterisk (*) at the end. I want to remove it from the file. My sequences look like this:
>snapgene1
SFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP*
>snapgene2
SFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP*
I want the sequences to look like this:
>snapgene1
SFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP
>snapgene2
SFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP
Can anyone help me with this? Thank you.
Upvotes: 0
Views: 95
Reputation: 37404
In awk, if you keep your FASTA sequences in a file named file:
$ awk '{sub(/\*$/,"")}1' file
>snapgene1
SFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP
>snapgene2
SFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP
It replaces the trailing * on each line with nothing.
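If a header line could ever end in *, the substitution can be limited to sequence lines. This is only a sketch under that assumption; the file names sample.fasta and cleaned.fasta are made up for illustration, and the sample records are the ones from the question.

```shell
# Build a small sample file from the question's records (sample.fasta is a hypothetical name)
printf '>snapgene1\nSFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP*\n>snapgene2\nSFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP*\n' > sample.fasta

# Strip a trailing '*' from sequence lines only;
# header lines (starting with '>') are printed unchanged by the final '1'
awk '!/^>/ { sub(/\*$/, "") } 1' sample.fasta > cleaned.fasta
```

The result goes to a separate file, so the original stays available for comparison.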
Upvotes: 0
Reputation: 98
If the text is stored in a file "temp.txt", you can use this command:
sed -i 's/\*$//' temp.txt
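Note that -i without an argument is the GNU sed form; BSD/macOS sed expects a backup suffix after -i. A possible way around this, sketched here with the asterisk escaped explicitly, is to write to a new file and swap it in only after checking the result. The name temp.cleaned.txt is made up for illustration.

```shell
# Sample input taken from the question
printf '>snapgene1\nSFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP*\n' > temp.txt

# Write the result to a new file instead of editing in place
# (temp.cleaned.txt is a hypothetical name)
sed 's/\*$//' temp.txt > temp.cleaned.txt

# Replace the original only if no line still ends in '*'
if grep -q '\*$' temp.cleaned.txt; then
    echo "some lines still end in '*'; leaving temp.txt unchanged" >&2
else
    mv temp.cleaned.txt temp.txt
fi
```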
Upvotes: 1