Reputation: 923
I have data like the line below and need to (1) retrieve the first element of each cluster (each "number cluster" holds 5 elements separated by ":") and (2) group the retrieved elements three per group.
chr1 69270 . A G 1/1:208,34:244:14.96:118,15,0 0/1:186,51:241:8.72:80,0,9 0/0:226,1:236:3.01:0,3,30 ./. 1/1:209,35:250:12:116,12,0 ./. 1/1:186,53:242:14.97:126,15,0 0/0:245,0:248:3.01:0,3,33 1/1:182,60:243:23.95:201,24,0
I am sure there is a better way to do it, but so far I could only think of brute force, which is always bad. The other option is to use dynamic scalars, but they would essentially do exactly what the bad code below does. I don't see much improvement, and others on Stack Overflow said that using dynamic scalars is also always bad.
I am still learning the basics of Perl, so I don't know what other options are available. Any help will be appreciated.
my @genotype1 = split (/:/, $original_line[6]);
my @genotype2 = split (/:/, $original_line[7]);
my @genotype3 = split (/:/, $original_line[8]);
my @genotype4 = split (/:/, $original_line[9]);
my @genotype5 = split (/:/, $original_line[10]);
my @genotype6 = split (/:/, $original_line[11]);
my @genotype7 = split (/:/, $original_line[12]);
my @genotype8 = split (/:/, $original_line[13]);
my @genotype9 = split (/:/, $original_line[14]);
my @trio1 = ($genotype1[0], $genotype2[0], $genotype3[0]);
my @trio2 = ($genotype4[0], $genotype5[0], $genotype6[0]);
my @trio3 = ($genotype7[0], $genotype8[0], $genotype9[0]);
Upvotes: 0
Views: 86
Reputation: 385897
If you were using "dynamic variables", you would have
for my $i (6..14) {
@{ "genotype".($i-5) } = split (/:/, $original_line[$i]);
}
Just change it to
my @genotypes;
for my $i (6..14) {
@{ $genotypes[$i-6] } = split (/:/, $original_line[$i]);
}
which might be a bit cleaner as
my @genotypes;
for my $i (6..14) {
$genotypes[$i-6] = [ split (/:/, $original_line[$i]) ];
}
or
my @genotypes;
for my $i (6..14) {
push @genotypes, [ split (/:/, $original_line[$i]) ];
}
or
my @genotypes;
for (@original_line[6..14]) {
push @genotypes, [ split /:/ ];
}
or
my @genotypes = map { [ split /:/ ] } @original_line[6..14];
But you only need the first element, so you can use
my @genotypes = map { ( split /:/ )[0] } @original_line[6..14];
Then, all you need is to grab three elements from that array at a time, so you get:
my @genotypes = map { ( split /:/ )[0] } @original_line[6..14];
my @trios;
while (@genotypes) {
push @trios, [ splice @genotypes, 0, 3 ];
}
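To see the whole thing run end to end, here is a minimal sketch using the sample line from the question. It assumes the line is whitespace-separated. Because the sample as posted has only five leading columns, the genotype clusters start at index 5 here rather than at 6 as in the question's code; the `$first` variable is just an illustrative name for that offset.

```perl
use strict;
use warnings;

# Sample line copied from the question (assumed to be whitespace-separated).
my $line = 'chr1 69270 . A G 1/1:208,34:244:14.96:118,15,0 '
         . '0/1:186,51:241:8.72:80,0,9 0/0:226,1:236:3.01:0,3,30 ./. '
         . '1/1:209,35:250:12:116,12,0 ./. 1/1:186,53:242:14.97:126,15,0 '
         . '0/0:245,0:248:3.01:0,3,33 1/1:182,60:243:23.95:201,24,0';

my @original_line = split ' ', $line;

# Assumption: in this abridged sample the clusters start at index 5;
# the question's real data uses 6..14 instead.
my $first = 5;

# Keep only the first ":"-separated element of each cluster.
my @genotypes = map { ( split /:/ )[0] } @original_line[ $first .. $#original_line ];

# Peel off three elements at a time; splice empties @genotypes as it goes.
my @trios;
while (@genotypes) {
    push @trios, [ splice @genotypes, 0, 3 ];
}

print join(' ', @$_), "\n" for @trios;
```

A cluster such as `./.` contains no ":", so `split` returns it unchanged and it is kept as the first element.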
Upvotes: 3
Reputation: 57490
There are several different ways to make your code more efficient; most (all?) would take advantage of the fact that you only ever use the first element of each @genotype list. One example:
my @elements = map { (split /:/)[0] } @original_line[6..14];
my @trio1 = @elements[0,1,2];
my @trio2 = @elements[3,4,5];
my @trio3 = @elements[6,7,8];
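If the number of groups is not fixed at three, the same slicing idea can be written with computed indices instead of one named array per trio. This is only a sketch: the `$size` variable and the hard-coded `@elements` values (the first elements from the question's sample line) are there for illustration.

```perl
use strict;
use warnings;

# First elements of the nine clusters in the question's sample line.
my @elements = qw(1/1 0/1 0/0 ./. 1/1 ./. 1/1 0/0 1/1);

my $size = 3;    # elements per group

# Slice out consecutive groups of $size without modifying @elements.
# If the element count is not a multiple of $size, the trailing
# partial group is dropped by this range calculation.
my @trios = map { [ @elements[ $_ * $size .. $_ * $size + $size - 1 ] ] }
            0 .. @elements / $size - 1;

print "trio ", $_ + 1, ": @{ $trios[$_] }\n" for 0 .. $#trios;
```

Unlike a `splice` loop, this leaves `@elements` intact, which may matter if the flat list is needed again later.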
Upvotes: 3