Reputation: 31
I'm writing a Perl script that requires me to pull out a whole column from a file and manipulate it. For example, take out column A and compare it to a column in another file:
A B C
A B C
A B C
So far I have:
sub routine1
{
( $_ = <FILE> )
{
next if $. < 2; # to skip header of file
my @array1 = split(/\t/, $_);
my $file1 = $array1[@_];
return $file1;
}
}
I have most of it done. The only problem is that when I print the subroutine's return value, it only prints the first element of the column (i.e. it prints only one A).
Upvotes: 0
Views: 171
Reputation: 126722
I am sure that what you actually have is this
sub routine1
{
while ( $_ = <FILE> )
{
next if $. < 2; # to skip header of file
my @array1 = split(/\t/, $_);
my $file1 = $array1[@_];
return $file1;
}
}
which does compile, and reads the file one line at a time in a loop.
There are two problems here. First of all, as soon as your loop has read the first line of the file (after the header) the return statement exits the subroutine, returning the only field it has read. That is why you get only a single value.
Secondly, you have indexed your @array1 with @_. What that does is take the number of elements in @_ (usually one) and use that to index @array1. You will therefore always get the second element of the array.
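To make the second problem concrete, here is a minimal sketch. The two helper subs and the sample fields are hypothetical and exist only for illustration. An array used as a subscript is evaluated in scalar context, so it yields its element count rather than its contents.

```perl
use strict;
use warnings;

# Hypothetical helper: indexes with @_ itself, as in the question.
sub pick_with_arg_list {
    my @fields = ('A', 'B', 'C');
    return $fields[@_];    # @_ as a subscript gives the argument count
}

# Hypothetical helper: unpacks the argument first, then indexes with it.
sub pick_with_first_arg {
    my ($index) = @_;
    my @fields = ('A', 'B', 'C');
    return $fields[$index];
}

print pick_with_arg_list(0), "\n";     # one argument, so index 1: prints B
print pick_with_first_arg(0), "\n";    # index 0: prints A
```

Passing 0 to the first sub still selects index 1, because only the number of arguments matters there.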
I'm not clear what you expect as a result, but you should write something like this. It accumulates all the values from the specified column into the array @retval, and passes the file handle into the subroutine instead of relying on a global file handle, which is poor programming practice.
use strict;
use warnings;
open my $fh, '<', 'myfile.txt' or die $!;
my @column2 = routine1($fh, 1);
print "@column2\n";
sub routine1 {
my ($fh, $index) = @_;
my @retval;
while ($_ = <$fh>) {
next if $. < 2; # to skip header of file
chomp; # so the last column does not keep its newline
my @fields = split /\t/;
my $field = $fields[$index];
push @retval, $field;
}
return @retval;
}
output
B B
Upvotes: 1
Reputation: 6204
Here are a few items for you to consider when crafting a subroutine solution for obtaining an array of column values from a file:
Read the header line before the while loop to avoid a line-number comparison for each file line.
split only the number of columns you need by using split's LIMIT. This can significantly speed up the process.
Use a local copy of Perl's @ARGV with the file name, and let Perl handle the file i/o.
Borodin's idea of a subroutine that takes both the file and the column number is excellent, so it's implemented below, too:
use strict;
use warnings;
my @colVals = getFileCol( 'File.txt', 0 );
print "@colVals\n";
sub getFileCol {
local @ARGV = (shift);
my ( $col, @arr ) = shift;
<>; # skip file header
while (<>) {
my $val = ( split ' ', $_, $col + 2 )[$col] // next; # skip short lines, but keep values such as "0"
push @arr, $val;
}
return @arr;
}
Output on your dataset:
A A
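As a rough illustration of the LIMIT point, the sketch below splits one sample line with and without a limit. The sample line and the chosen column are assumptions made up for this example. The limit is $col + 2 rather than $col + 1 because the final element holds the unsplit remainder of the line, so the wanted column must not be the last one.

```perl
use strict;
use warnings;

my $line = "A B C D E\n";    # hypothetical sample line
my $col  = 1;                # zero-based column to extract

# Without a LIMIT, split produces every field on the line.
my @all = split ' ', $line;

# With LIMIT = $col + 2, split stops early and leaves the rest of
# the line, unsplit, in the final element.
my @limited = split ' ', $line, $col + 2;

print scalar(@all),     " fields without LIMIT\n";
print scalar(@limited), " fields with LIMIT\n";
print "column $col is $limited[$col]\n";
```

The saving grows with the number of columns per line, since split never touches the fields after the one you need.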
Hope this helps!
Upvotes: 0
Reputation: 35198
Jumping to the end, the following will pull out the first column in your file blah.txt and put it in an array for you to manipulate later:
use strict;
use warnings;
use autodie;
my $file = 'blah.txt';
open my $fh, '<', $file;
my @firstcol;
while (<$fh>) {
chomp;
my @cols = split;
push @firstcol, $cols[0];
}
use Data::Dump;
dd \@firstcol;
What you have right now isn't actually looping on the contents of the file, so you aren't going to be building an array.
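Once a column is in an array, comparing it with a column from another file could look something like the sketch below. Everything in it is an assumption for illustration: the sample data, the first_column helper, and the choice to compare by membership using a hash. In-memory file handles stand in for the two real files so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the two files.
my $data1 = "A1 B1 C1\nA2 B2 C2\nA3 B3 C3\n";
my $data2 = "A2 X Y\nA3 X Y\nA4 X Y\n";

# Hypothetical helper: collects the first whitespace-separated field
# of every non-blank line read from the given file handle.
sub first_column {
    my ($fh) = @_;
    my @col;
    while ( my $line = <$fh> ) {
        my @fields = split ' ', $line;
        push @col, $fields[0] if @fields;
    }
    return @col;
}

open my $fh1, '<', \$data1 or die $!;
open my $fh2, '<', \$data2 or die $!;
my @col1 = first_column($fh1);
my @col2 = first_column($fh2);

# Build a lookup of the second column, then test each value of the first.
my %in_col2 = map { $_ => 1 } @col2;
my @common  = grep {  $in_col2{$_} } @col1;
my @only1   = grep { !$in_col2{$_} } @col1;

print "in both: @common\n";
print "only in first: @only1\n";
```

A hash lookup keeps the comparison to a single pass over each column, which matters once the files get large.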
Upvotes: 0
Reputation: 26
Try replacing most of your sub with something like this:
my @aColumn = ();
while (<FILE>)
{
chomp;
my ($Acol, $Bcol, $Ccol) = split /\t/; # declare the fields so this works under strict
push(@aColumn, $Acol);
}
return @aColumn;
Upvotes: 0