Reputation: 67
I'm a beginner in Perl and I want to parse some fields from a swiss file into a text file. I have found a way to parse the ID from the swiss file, but nothing more so far. I need to extract both the ID and the AC fields from the file.
My swiss file looks like this:
ID   140U_DROME Reviewed; 261 AA.
AC   P81928; Q9VFM8;
SQ   SEQUENCE 261 AA; 29182 MW; 5DB78CF6CFC4435A CRC64;
MNFLWKGRRF LIAGILPTFE GAADEIVDKE NKTYKAFLAS KPPEETGLER LKQMFTIDEF
GSISSELNSV YQAGFLGFLI GAIYGGVTQS RVAYMNFMEN NQATAFKSHF DAKKKLQDQF
TVNFAKGGFK WGWRVGLFTT SYFGIITCMS VYRGKSSIYE YLAAGSITGS LYKVSLGLRG
MAAGGIIGGF LGGVAGVTSL LLMKASGTSM EEVRYWQYKW RLDRDENIQQ AFKKLTEDEN
PELFKAHDEK TSEHVSLDTI K
//
My code:
open(IN, "<transmem_proteins.swiss") or die "Cant open the file";
open(OUT, ">text.txt") or die "Cant open the file";
while(<IN>){
    if($_=~/^ID\s{3}(\S+\s)/){
        print OUT ">$1| \n";
        print OUT "// \n";
    }
}
Upvotes: 0
Views: 185
Reputation: 40778
Here is an example of how to extract the data from the swiss file:
use feature qw(say);
use strict;
use warnings;
{
    my $data = read_swiss_file();
    my @ids;
    for my $chunk ( @$data ) {
        my ( $item1, $item2, $item3 );
        # ID line: entry name, status, then the length (e.g. "261 AA").
        # Trailing whitespace after the final period is optional.
        if ( $chunk =~ /^ID\s{3}(\S+)\s+\S+;\s+(.*)\.\s*$/m ) {
            $item1 = $1;
            $item2 = $2;
            $item2 =~ s/\s+//g;    # "261 AA" becomes "261AA"
        }
        # AC line: keep only the first accession number.
        if ( $chunk =~ /^AC\s{3}(\S+);/m ) {
            $item3 = $1;
        }
        push @ids, [$item1, $item2, $item3] if defined $item1;
    }
    my $fn = 'text.txt';
    open ( my $fh, '>', $fn ) or die "Could not open file '$fn': $!";
    for my $items (@ids) {
        say $fh "->", join '|', @$items;
    }
    close $fh;
}
# Slurp the whole file and split it into records at the "//" terminator lines.
sub read_swiss_file {
    my $fn = 'transmem_proteins.swiss';
    open ( my $fh, '<', $fn ) or die "Could not open file '$fn': $!";
    my $str = do { local $/; <$fh> };
    close $fh;
    my @chunks = split /(?m:^\/\/)/, $str;
    return \@chunks;
}
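If the file is large, or if you need every accession number on the AC line rather than only the first, a variation like the one below may help. It is only a sketch: parse_swiss_record is a hypothetical helper name, the output layout is a guess at what you want, and it assumes Unix line endings so that each record ends with "//" followed by a newline. The sample record is embedded in the script so it runs without the real file.

```perl
use strict;
use warnings;
use feature qw(say);

# Hypothetical helper (not part of the code above): extract the entry
# name, the sequence length and all accession numbers from one record.
sub parse_swiss_record {
    my ($record) = @_;
    my %entry;
    if ( $record =~ /^ID\s+(\S+)\s+\S+;\s+(\d+)\s+AA\./m ) {
        @entry{qw(id length)} = ( $1, $2 );
    }
    # A record may contain several AC lines; collect every accession.
    my @acc;
    while ( $record =~ /^AC\s+(.*)$/mg ) {
        my $line = $1;
        push @acc, grep { length } split /;\s*/, $line;
    }
    $entry{accessions} = \@acc;
    return defined $entry{id} ? \%entry : undef;
}

# Shortened sample record, used here instead of the real file.
my $sample = <<'END';
ID   140U_DROME              Reviewed;         261 AA.
AC   P81928; Q9VFM8;
SQ   SEQUENCE   261 AA;  29182 MW;  5DB78CF6CFC4435A CRC64;
     MNFLWKGRRF LIAGILPTFE
//
END

# Read one record at a time by setting the input record separator to the
# "//" terminator line, instead of slurping the whole file into memory.
open( my $in, '<', \$sample ) or die "Could not open in-memory sample: $!";
{
    local $/ = "//\n";
    while ( my $record = <$in> ) {
        my $entry = parse_swiss_record($record) or next;
        say ">$entry->{id}|$entry->{length}AA|",
            join( ';', @{ $entry->{accessions} } );
    }
}
close $in;
```

To use it on the real data, open 'transmem_proteins.swiss' in place of the in-memory sample and print to a filehandle opened on 'text.txt'.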
Upvotes: 1