Safina A.R

Reputation: 19

Removing the asterisk (*) from the end of each FASTA sequence in a multi-FASTA file

I have a multi-FASTA file containing predicted proteins from 2 ab initio tools. Every sequence has an asterisk (*) at the end. I want to remove it from the file. My sequences look like this:

>snapgene1
SFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP*
>snapgene2
SFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP*

I want the sequences to look like this:

>snapgene1
SFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP
>snapgene2
SFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP

Can anyone help me with this? Thank you.

Upvotes: 0

Views: 95

Answers (2)

James Brown

Reputation: 37404

In awk, assuming your FASTA records are in a file named file:

$ awk '{sub(/\*$/,"")}1' file
>snapgene1
SFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP
>snapgene2
SFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP

The sub() call deletes a * at the end of each line, and the trailing 1 prints every line.
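
If a header line could itself end in an asterisk, the substitution can be limited to sequence lines. Here is a minimal sketch of that variant; the file names proteins.fasta and proteins.clean.fasta are hypothetical, and the sample file is created only so the example runs on its own:

```shell
# Build a small sample input (hypothetical file name) from the question's data.
printf '>snapgene1\nSFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP*\n>snapgene2\nSFLPSAEAIEKVLSHIIIIAAAAKKKPPFFDDMKAELQQPEMRWFWP*\n' > proteins.fasta

# On lines that do not start with ">", strip one trailing "*".
# The final "1" is an always-true pattern, so every line is printed.
awk '!/^>/ {sub(/\*$/,"")} 1' proteins.fasta > proteins.clean.fasta
```

Writing to a new file leaves the original untouched, so the result can be checked before replacing anything.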

Upvotes: 0

signjing

Reputation: 98

If the text is stored in a file "temp.txt", you can use this command:

sed -i 's/\*$//' temp.txt
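
Note that -i edits the file in place, and its syntax differs between sed implementations (BSD/macOS sed expects a backup suffix argument after -i). A sketch that avoids -i altogether by writing to a second file; temp.clean.txt is a hypothetical output name, and the sample input is created only for demonstration:

```shell
# Build a small sample input so the example is self-contained.
printf '>snapgene1\nSFLPSAEAIEKVLSHMSRRIIDDMKAELQQPEMRWFWP*\n' > temp.txt

# Delete a literal "*" at the end of each line; the original file is not modified.
sed 's/\*$//' temp.txt > temp.clean.txt
```

Once the output looks right, it can be moved over the original with mv.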

Upvotes: 1
