Remove pattern from each line in fasta file in

Question

I have a fasta file, file.fasta, that has the following patterns:

>firstnumber 01abc_numericsequence    
CGTAATCG  
>secondnumber 01abc_anothernumericsequence  
GGTAAACC

and so on, but I'd like the output to be something like:

>firstnumber   
CGTAATCG  
>secondnumber   
GGTAAACC

How can I delete the pattern 01abc and everything after it on each header line, and overwrite file.fasta with the result?

Please, can anyone provide a solution?

justaguy · Accepted Answer

cat fasta

>firstnumber 01abc_numericsequence    
CGTAATCG  
>secondnumber 01abc_anothernumericsequence  
GGTAAACC


awk '/^>/ {$0=$1} 1' fasta

>firstnumber
CGTAATCG  
>secondnumber
GGTAAACC

sed '/^>/ s/ .*//' fasta

>firstnumber
CGTAATCG  
>secondnumber
GGTAAACC

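Both commands act only on lines that start with >. The sed command deletes everything from the first space (inclusive) onward, and the awk command keeps only the first whitespace-delimited field. Sequence lines are printed as they are. Both write the result to standard output and leave the input file unchanged.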
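
The question also asks to overwrite file.fasta, which the commands above do not do by themselves. Below is a minimal sketch of one way to do that, assuming a sed that accepts -i with an attached backup suffix (GNU and BSD sed both do) and using a scratch directory with made-up sample headers so that no real data is touched. The names workdir and file2.fasta are hypothetical and only there for illustration.

```shell
# Scratch directory with hypothetical sample data (not the asker's real file).
workdir=$(mktemp -d)
printf '>firstnumber 01abc_123\nCGTAATCG\n>secondnumber 01abc_456\nGGTAAACC\n' > "$workdir/file.fasta"

# sed: edit the file in place and keep the original as file.fasta.bak.
sed -i.bak '/^>/ s/ .*//' "$workdir/file.fasta"

# awk: has no portable in-place mode, so write to a temporary file
# and replace the target only if awk succeeded.
cp "$workdir/file.fasta.bak" "$workdir/file2.fasta"
awk '/^>/ {$0=$1} 1' "$workdir/file2.fasta" > "$workdir/file2.fasta.tmp" \
  && mv "$workdir/file2.fasta.tmp" "$workdir/file2.fasta"

# Show the rewritten file.
cat "$workdir/file.fasta"
```

On a real file, the same two commands apply with file.fasta in place of the scratch paths. Keeping the .bak copy until the result has been checked is a cautious default, since an in-place edit cannot otherwise be undone.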
