Reputation: 5
I have a file where I want to remove everything before and including the first space for each line. For example, if my file looks like this:
>JQ907469.1 Gracilariopsis mclachlanii voucher BG0072 23S ribosomal RNA gene, partial sequence; plastid
>JQ907467.1 Gracilariopsis longissima voucher BG0052 23S ribosomal RNA gene, partial sequence; plastid
>JQ907456.1 Hydropuntia rangiferina voucher BG0092 23S ribosomal RNA gene, partial sequence; plastid
>JQ907428.1 Gracilaria cornea voucher BG0112 23S ribosomal RNA gene, partial sequence; plastid
>JQ952662.1 Gracilariopsis tenuifrons voucher BG0042 23S ribosomal RNA gene, partial sequence; plastid
I want it to look like this:
Gracilariopsis mclachlanii voucher BG0072 23S ribosomal RNA gene, partial sequence; plastid
Gracilariopsis longissima voucher BG0052 23S ribosomal RNA gene, partial sequence; plastid
Hydropuntia rangiferina voucher BG0092 23S ribosomal RNA gene, partial sequence; plastid
Gracilaria cornea voucher BG0112 23S ribosomal RNA gene, partial sequence; plastid
Gracilariopsis tenuifrons voucher BG0042 23S ribosomal RNA gene, partial sequence; plastid
I assume I can use sed to achieve this, but I'm not familiar enough with its notation and syntax yet to experiment. With that in mind, if someone has a solution, I'd love an explanation of why the code works the way it does.
Cheers
Upvotes: 0
Views: 522
Reputation: 15206
Employing a regex, and assuming you're using a reasonably current GNU sed:
sed -r 's/^[^ \t]+[ \t]//' yourfile
If you're happy with the output, add -i to change the file itself:
sed -i -r 's/^[^ \t]+[ \t]//' yourfile
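Since -i overwrites the file, it can be worth keeping a copy the first time you run it. The following is a minimal sketch, assuming GNU sed, which accepts a suffix attached to -i and saves the original under that name. The scratch file and the .bak suffix are stand-ins chosen for illustration, not anything from the question.

```shell
# Hypothetical scratch file standing in for yourfile.
work=$(mktemp)
printf '%s\n' '>JQ907456.1 Hydropuntia rangiferina voucher BG0092' > "$work"

# -i.bak edits the file in place and keeps the original as "$work.bak" (GNU sed).
sed -i.bak -r 's/^[^ \t]+[ \t]//' "$work"

edited=$(cat "$work")
backup=$(cat "$work.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

# Clean up the scratch files.
rm -f "$work" "$work.bak"
```

Once you have checked the edited file, the .bak copy can be deleted.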
How does it work?
s/
starts a search & replace
^[^ \t]+[ \t]
is a regular expression that means: starting at the beginning of the line (^), match one or more characters that are not a space or TAB ([^ \t]+), followed by the first space or TAB ([ \t]).
//
The slashes, including the one right after the s at the start of the command, are separators. The search pattern sits between the first and second slash, and the replacement sits between the second and third (in your case, nothing).
-r
tells GNU sed to use extended regular expression syntax, so that + means "one or more" without needing a backslash.
-i
tells it to modify the file in place.
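To see the pieces above working together, here is a small sketch that runs the preview command on two of the header lines from the question. It assumes GNU sed (for -r and for \t inside the brackets), and the temporary file is a hypothetical stand-in for yourfile.

```shell
# Two sample headers from the question, written to a scratch file.
tmpfile=$(mktemp)
printf '%s\n' \
  '>JQ907469.1 Gracilariopsis mclachlanii voucher BG0072 23S ribosomal RNA gene, partial sequence; plastid' \
  '>JQ907428.1 Gracilaria cornea voucher BG0112 23S ribosomal RNA gene, partial sequence; plastid' \
  > "$tmpfile"

# Remove the leading run of non-blank characters plus the first space or TAB.
# There is no g flag, so only the first match on each line is replaced.
result=$(sed -r 's/^[^ \t]+[ \t]//' "$tmpfile")
printf '%s\n' "$result"

# Clean up the scratch file.
rm -f "$tmpfile"
```

Only the accession number and the first space disappear. The later spaces stay because the ^ anchor ties the match to the start of the line and the substitution is applied once per line.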
Upvotes: 1