Reputation: 301
I have the following file:
id001 word1(100);"word2"(100);"word3"(98);"word4"(98);"word5"(94);word6;
id002 word1(100);word7(100);word8(100);word9(100);word10;word11;
I want to split each line of my file to retrieve the id (=id00x), val (=wordX) and int (=100) into an array. My code:
my @fields = split /[\t();"]"?/, $line;
$id = $fields[0];
for ( my $i = 1; $i < @fields; $i +=2 )
{
$val=$fields[$i];
$int=$fields[$i+1]
}
I only retrieve the id and the val entries that are not between double quotes. Can you please give me a few leads?
Upvotes: 0
Views: 294
Reputation: 1245
The problem here is the regexp. You can verify this by putting in a loop immediately after doing the split, like this:
my @fields = split /[\t();"]"?/, $line;
$id = $fields[0];
foreach my $field(@fields) {
print("field is $field\n");
}
That will show you that you have several empty fields, and that's why your variables don't get the values you expect. The reason for the empty fields is that the regexp lets any one of the listed characters act as a separator, so when two of them appear in succession (for example ); or "( in your data), each one causes its own split and an empty field is produced between them.
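To make the empty fields visible, here is a minimal sketch that applies the pattern from the question to a shortened sample line and prints each field in brackets. The helper name split_original is hypothetical, and the sample assumes the id is followed by a tab.

```perl
use strict;
use warnings;

# Hypothetical helper: split a line with the pattern from the question
# and return every field, including the empty ones.
sub split_original {
    my ($line) = @_;
    return split /[\t();"]"?/, $line;
}

# Shortened sample line; a tab after the id is an assumption.
my $line   = qq{id001\tword1(100);"word2"(100);word6;};
my @fields = split_original($line);

# Brackets make empty fields show up as [].
print join('|', map { "[$_]" } @fields), "\n";
```

The empty brackets in the output mark the places where two separators sit next to each other, which is what throws off the step-by-two loop in the question.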
I'd make it easier by not trying to split the entire line at once. Instead, I'd start by splitting the line into smaller parts and then use a regexp to extract the values from each part. Here's my suggestion:
my @fields = split /[\t;]/, $line;
$id = $fields[0];
for ( my $i = 1; $i < $#fields; $i++ )
{
($val, $int) = $fields[$i] =~ /\"?(\w+)\"?\((\d+)\)/;
print("val is $val, int is $int\n");
}
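As a variation on the same two-step idea, the sketch below collects the results into a list of [val, int] pairs instead of printing them, and also keeps words that have no number in parentheses (such as word6). The helper name parse_line and the choice of leaving int undefined for such words are assumptions, not something the question specifies.

```perl
use strict;
use warnings;

# Hypothetical helper: return the id and a list of [val, int] pairs
# for one line. Assumes the id is followed by a tab.
sub parse_line {
    my ($line) = @_;
    chomp $line;
    my ($id, @items) = split /[\t;]/, $line;
    my @pairs;
    for my $item (@items) {
        # Optional quotes around the word, optional "(number)" after it.
        if ($item =~ /^"?(\w+)"?(?:\((\d+)\))?$/) {
            push @pairs, [ $1, $2 ];   # $2 is undef when no number is given
        }
    }
    return ($id, @pairs);
}

my ($id, @pairs) = parse_line(qq{id001\tword1(100);"word2"(100);word6;});
for my $pair (@pairs) {
    my ($val, $int) = @$pair;
    printf "id=%s val=%s int=%s\n", $id, $val, defined $int ? $int : 'none';
}
```

Anchoring the pattern with ^ and $ means a malformed item is skipped rather than partially matched.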
Also note that $#arrayname is the index of the last element, not the number of elements; @arrayname in scalar context gives the count. The loop above uses $i < $#fields, so it stops before the last field, which in the sample data (word6) has no number in parentheses.
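For reference, a small sketch of how the two forms behave in Perl: $#fields is the index of the last element, while @fields in scalar context is the element count. The array contents here are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative data only.
my @fields = ('id001', 'word1(100)', 'word6');

my $count      = scalar @fields;   # number of elements
my $last_index = $#fields;         # index of the last element (count - 1)

print "count=$count last_index=$last_index\n";

# Both loops visit every field after the id (indices 1 and 2 here).
for (my $i = 1; $i < @fields; $i++)   { print "a: $fields[$i]\n" }
for (my $i = 1; $i <= $#fields; $i++) { print "b: $fields[$i]\n" }
```

Using $i < $#fields instead would skip the last element, which is only what you want when the final field should be ignored.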
Below here is the original answer, which was just about syntax
Here's at least one error:
$val=$fields[i];
$int=$fields[i+1]
You need to use $ before the i as well, like so:
$val=$fields[$i];
$int=$fields[$i+1]
Upvotes: 2