Reputation: 605
I have this Perl program which picks data from specific columns starting from a certain row.
#!/usr/bin/perl
# This script is to pick the specific columns from a file, starting from a specific row
# FILE -> Name of the file to be passed at run time.
# rn -> Number of the row from where the data has to be picked.
use strict;
use warnings;
my $file = shift || "FILE";
my $rn = shift;
my $cols = shift;
open(my $fh, "<", $file) or die "Could not open file '$file' : $!\n";
while (<$fh>) {
$. <= $rn and next;
my @fields = split(/\t/);
print "$fields[$cols]\n";
}
My problem is that I am only able to get one column at a time. I want to be able to specify a selection of indices like this
0, 1, 3..6, 21..33
but it's giving me only the first column.
I am running this command to execute the script
perl extract.pl FILE 3 0, 1, 3..6, 21..33
Upvotes: 1
Views: 3017
Reputation: 107090
Perl presents the parameters entered on the command line in an array called @ARGV. Since this is an ordinary array, you could use the length of this array to get additional information. Outside a subroutine, the shift command shifts values from the beginning of the @ARGV array when you don't give it any parameters.
You could do something like this:
my $file = shift; # Adding || "FILE" doesn't work. See below
my $rn = shift;
my @cols = @ARGV;
Instead of cols being a scalar variable, it's now an array that can hold all of the columns you want. In other words, the first parameter is the file name, the second parameter is the row, and the last set of parameters are the columns you want:
while (<$fh>) {
    next if $. <= $rn;
    my @fields = split(/\t/);
    for my $column ( @cols ) {
        printf "%-10.10s", $fields[$column];
    }
    print "\n";
    # last; # You printed the row. Do you want to stop? (Perl's loop exit is "last", not "break")
}
Now, this isn't as fancy-pants as your way of doing it, where you can give ranges, etc., but it's fairly straightforward:
$ perl extract.pl FILE 3 0 1 3 4 5 6 21 22 23 24 25 26 27 28 29 30 31 32 33
Note I used printf instead of print so all of the fields will be the same width (assuming that they're strings and none is longer than 10 characters).
I tried looking for a Perl module that will handle range input like you want. I'm sure one exists, but I couldn't find it. You still need to allow for a range of input in @cols like I showed above, and then parse @cols to get the actual columns.
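One way to do that parsing by hand, as a minimal sketch: the helper name expand_columns is hypothetical, and it assumes each range is written as START..END with no spaces inside it. It accepts the arguments with or without trailing commas and avoids eval.

```perl
use strict;
use warnings;

# Hypothetical helper: turn specs such as ("0,", "1,", "3..6,", "21..33")
# into a flat list of column indices.
sub expand_columns {
    my @specs = @_;
    my @cols;
    # Re-join the arguments, then split on commas and/or whitespace,
    # discarding any empty pieces.
    for my $item ( grep { length } split /[\s,]+/, join( ',', @specs ) ) {
        if ( $item =~ /\A(\d+)\.\.(\d+)\z/ ) {
            # Force numeric values so the range is always numeric.
            my ( $start, $end ) = ( $1 + 0, $2 + 0 );
            die "Descending range '$item'\n" if $start > $end;
            push @cols, $start .. $end;
        }
        elsif ( $item =~ /\A\d+\z/ ) {
            push @cols, $item + 0;
        }
        else {
            die "Invalid column spec '$item'\n";
        }
    }
    return @cols;
}

my @cols = expand_columns(@ARGV);
print "@cols\n";
```

With this in place, @fields[@cols] would select all of the requested columns in one slice.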
my $file = shift || "FILE"; ?
In your program, you're assuming three parameters. That means you need a file, a row, and at least one column parameter. You will never have a situation where not giving a file name will work since it means you don't have a row or a set of columns to print out.
So, you need to look at @ARGV and verify it has at least three values in it. In scalar context @ARGV gives the number of arguments; $#ARGV is the index of the last one, so it would have to be at least 2. If it doesn't have three values, you need to decide what to do at that point. The easy solution is to just abort the program with a little message telling you the correct usage. You could verify if there are one, two, or three parameters and decide what to do there.
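A minimal sketch of that check, assuming the three-part layout from the question (file, row, then one or more columns). The helper name usage_error and the wording of the message are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return a usage message when fewer than three
# arguments (file, row, at least one column) are given, otherwise undef.
sub usage_error {
    my @args = @_;
    return if @args >= 3;    # @args in numeric context is the argument count
    return "Usage: $0 FILE ROW COLUMN [COLUMN ...]\n";
}

# Demonstrate both outcomes with sample argument lists.
for my $args ( [ 'FILE', 3 ], [ 'FILE', 3, 0, 1 ] ) {
    my $message = usage_error(@$args);
    print defined $message ? $message : "OK: @$args\n";
}

# In the real script you would check @ARGV before shifting anything:
#   my $message = usage_error(@ARGV);
#   die $message if defined $message;
```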
Another idea is to use Getopt::Long which will allow you to use named parameters. You can load the parameters with pre-defined defaults, and then change when you read in the parameters:
...
use Getopt::Long;
my $file = "FILE";      # File has a default
my ( $rows, @cols );    # No default values
my $help;               # Allow user to request help
GetOptions (
    "file=s"     => \$file,
    "rows=i"     => \$rows,
    "cols=i{1,}" => \@cols,    # {1,} lets one -cols take several values
    "help"       => \$help,
);
if ( $help ) {
    print_help();
}
if ( not defined $rows ) {
    error_out ( "Need to define which row to fetch" );
}
if ( not @cols ) {
    error_out ( "Need to define which columns" );
}
The user could call this via:
$ perl extract.pl -file FILE -row 3 -col 0 -col 1 3 4 5 6 21 22 23 24 25 26 27 28 29 30 31 32 33
Note that because the option is declared as cols=i{1,}, GetOptions will assume that all integer values after the -col are for that option; with a plain cols=i, each -col would take only one value. Also note I could, if I want, repeat -col for each column. The shortened names -row and -col work because Getopt::Long accepts unique abbreviations of option names by default.
By the way, if you use Getopt::Long, you might as well use Pod::Usage. POD stands for Plain Old Documentation, which is Perl's way of documenting how a program is used. Might as well make this educational. Read up on POD Documentation, the POD Specifications, and the standard POD Style. This is how you document your Perl programming. You can use the perldoc command (betcha you didn't know it existed) to print out the embedded Perl POD documentation, and use Pod::Usage to print it out for the user.
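As a rough sketch of how the two modules fit together: the POD text below is invented for illustration, and only the -help option is wired up. pod2usage(1) prints the SYNOPSIS and OPTIONS sections and exits with status 1; pod2usage(2) is the conventional response to a bad option.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;
use Pod::Usage;

=head1 NAME

extract.pl - print selected tab-separated columns

=head1 SYNOPSIS

extract.pl -file FILE -rows N -cols N [N ...]

=head1 OPTIONS

=over 4

=item B<-help>

Print this usage message and exit.

=back

=cut

my $help;

# On an unknown option, print the SYNOPSIS to STDERR and exit with status 2.
GetOptions( "help" => \$help ) or pod2usage(2);

# On -help, print the SYNOPSIS and OPTIONS sections and exit with status 1.
pod2usage(1) if $help;

print "No help requested; the script would continue here.\n";
```

Running perldoc extract.pl on a script like this would show the same embedded documentation in full.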
Upvotes: 0
Reputation: 126772
In the absence of any other solutions I am posting some code that I have been messing with. It works with your command line as you have described it by concatenating all of the arguments after the first two (the file name and the row number) and removing all spaces and tabs.
The column set is converted to a list of integers using eval, after first making sure that it consists of a comma-separated list of either single integers or start-end ranges separated by two or three full stops.
use strict;
use warnings;
use 5.014; # For non-destructive substitution and \h regex item
my $file = shift || "FILE";
my $rn = shift || 0;
my $cols = join('', @ARGV) =~ s/\h+//gr;
my $item_re = qr/ \d+ (?: \.\.\.? \d+)? /ax;
my $set_re = qr/ $item_re (?: , $item_re )* /x;
die qq{Invalid column set "$cols"} unless $cols =~ / \A $set_re \z /x;
my @cols = eval $cols;
open my $fh, '<', $file or die qq{Couldn't open "$file": $!};
while (<$fh>) {
next if $. <= $rn;
my @fields = split /\t/;
print "@fields[@cols]\n";
}
Upvotes: 2
Reputation: 48649
My problem is that I am only able to get one column at a time
You don't understand what perl is passing to your program from the command line:
use strict;
use warnings;
use 5.016;
my $str = "1..3";
my $x = shift @ARGV; # $ perl myprog.pl 1..3
if ($str eq $x) {
say "It's a string";
}
else {
say "It's a range";
}
my @cols = (0, 1, 2, 3, 4);
say for @cols[$str];
--output:--
$perl myprog.pl 1..3
Scalar value @cols[$str] better written as $cols[$str] at 1.pl line 16.
It's a string
Argument "1..3" isn't numeric in array slice at 1.pl line 16.
1
Anything you write on the command line will be passed to your program as a string, and perl won't automatically convert the string "1..3" into the range 1..3 (in fact your string would be the strange looking "1..3,"). After emitting some warnings, perl sees a number on the front of the string "1..3", so perl converts the string to the integer 1. So, you need to process the string yourself:
use strict;
use warnings;
use 5.016;
my @fields = (0, 1, 2, 3, 4);
my $str = shift @ARGV; # perl myprog.pl 0,1..3 => $str = "0,1..3"
my @cols = split /,/, $str;
for my $col (@cols) {
if($col =~ /(\d+) [.]{2} (\d+)/xms) {
say @fields[$1..$2]; # $1 and $2 are strings but perl will convert them to integers
}
else {
say $fields[$col];
}
}
--output:--
$ perl myprog.pl 0,1..3
0
123
Upvotes: 1