brucezepplin

Reputation: 9752

Delete all characters in a line after a certain string

Hi, I have the following file:

>seq0 id345
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>seq1 id1045
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME 

and I am trying to remove every character after the > on those lines, so that I get:

>
FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF
>
KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME  

I have almost got this using:

sed -e 's/>.*//'

However, this also deletes the > symbols, leaving me with:

FQTWEEFSRAAEKLYLADPMKVRVVLKYRHVDGNLCIKVTDDLVCLVYRTDQAQDVKKIEKF

KYRTWEEFTRAAEKLYQADPMKVRVVLKYRHCDGNLCIKVTDDVVCLLYRTDQAQDVKKIEKFHSQLMRLME 

How do I keep the > characters?

Thanks.

Upvotes: 1

Views: 1992

Answers (2)

Lev Levitsky

Reputation: 65781

The simplest fix would be:

sed 's/>.*/>/'
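
To illustrate why this works: the pattern >.* matches the > together with everything after it on the line, and the replacement puts a literal > back. A minimal sketch, assuming the data arrives on standard input; the function name strip_headers and the shortened sequences are made up for the example:

```shell
# Hypothetical helper: replace '>' plus the rest of the line with a bare '>'.
strip_headers() {
    sed 's/>.*/>/'
}

# Shortened sample input in the same shape as the file in the question.
printf '%s\n' '>seq0 id345' 'FQTWEEF' '>seq1 id1045' 'KYRTWEEF' | strip_headers
```

If a > could also appear in the middle of a line, anchoring the pattern as 's/^>.*/>/' would restrict the substitution to lines that start with >.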

Upvotes: 3

Gilles Quénot

Reputation: 184975

A re-usable solution for more complicated cases (using a capturing group):

sed -r 's/(>).*/\1/'
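
As a sketch of what the capturing group buys you: the parentheses remember whatever they matched and \1 puts it back, so the part you keep does not have to be a fixed string. The example below keeps the >seqN identifier and drops the rest of the header. The pattern and the function name keep_seq_id are illustrations, not part of the answer above:

```shell
# Hypothetical helper: keep '>' and the first word after it, drop the rest.
# -E enables extended regular expressions; GNU sed also accepts -r for this,
# while -E works with both GNU and BSD sed.
keep_seq_id() {
    sed -E 's/^(>[^ ]*).*/\1/'
}

# Shortened sample input in the same shape as the file in the question.
printf '%s\n' '>seq0 id345' 'FQTWEEF' '>seq1 id1045' 'KYRTWEEF' | keep_seq_id
```

Lines that do not start with > are passed through unchanged because the pattern is anchored with ^.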

Upvotes: 3
