David Boord

Reputation: 83

How can I use Perl to extract a particular column from a tab-separated file?

I am really new to Perl and have been trying to piece together a solution for this. When I run this program I don't get any errors, but it doesn't display anything either.

The code is as follows:

#!/usr/bin/perl
open (DATA, "<test1.txt") or die ("Unable to open file");
use strict; use warnings;
my $search_string = "Ball";
while ( my $row = <DATA> ) {

    last unless $row =~ /\S/;
    chomp $row;
    my @cells = split /\t/, $row;

    if ($cells[0] =~/$search_string/){
        print $cells[0];
    }
}

My test data file looks like this:

Camera Make     Camera Model    Text    Ball    Swing
a       b       c       d       e
f       g       h       i       j
k       l       m       n       o

I am trying to see how this works before I use the actual test data file.

So how do I search for, say, "Ball" and have it return "d i n"?

Upvotes: 0

Views: 14540

Answers (4)

Axeman

Reputation: 29854

Try this out:

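use strict;
use warnings;
use feature 'say';    # say() is not enabled by default
use Data::Dumper;
use List::MoreUtils qw<first_index>;

# Read the header line and find the index of the 'Ball' column.
chomp( my $header = <DATA> );
my $column = first_index { $_ eq 'Ball' } split /\t/, $header;
say Data::Dumper->Dump( [ $column ], [ '*column' ] );
# Take that field from every remaining line.
my @balls  = map { chomp; ( split /\t/ )[$column] } <DATA>;
say Data::Dumper->Dump( [ \@balls ], [ '*balls' ] );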
__DATA__
Camera Make Camera Model    Text    Ball    Swing
a   b   c   d   e
f   g   h   i   j
k   l   m   n   o

To read from a real file, you would just change the handle from DATA to a file you opened:

open( my $in, '<', '/path/to/data.file' ) 
    or die "Could not open file: $!"
    ;

And then replace <DATA> with <$in>.
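For what it's worth, here is a minimal sketch of that file-based variant using only core modules, in case List::MoreUtils is not installed. It swaps first_index for List::Util's first applied to the index range, which is my substitution and not part of the code above. The temporary file created with File::Temp is only a stand-in for '/path/to/data.file' so the sketch runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use List::Util qw(first);

# Create a throwaway sample file so the sketch is self-contained
# (an assumption: it stands in for '/path/to/data.file').
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
print {$tmp} "Camera Make\tCamera Model\tText\tBall\tSwing\n",
             "a\tb\tc\td\te\n", "f\tg\th\ti\tj\n", "k\tl\tm\tn\to\n";
close $tmp or die "Could not close $path: $!";

open( my $in, '<', $path ) or die "Could not open file: $!";

# Header line: chomp first so the last title carries no newline.
chomp( my $header = <$in> );
my @titles = split /\t/, $header;

# Core-only replacement for first_index: search the index range.
my $column = first { $titles[$_] eq 'Ball' } 0 .. $#titles;
die "No 'Ball' column found\n" unless defined $column;

# Collect the field at that index from every remaining line.
my @balls;
while ( my $line = <$in> ) {
    chomp $line;
    push @balls, ( split /\t/, $line )[$column];
}
close $in;

print "column $column: @balls\n";    # prints "column 3: d i n"
```

Checking defined $column matters here because a match in the first column gives index 0, which is false but still a valid result.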

Upvotes: 2

TLP

Reputation: 67920

You can use Text::CSV_XS to very conveniently extract the data for you. It might be overkill for your limited data, but it is a very solid solution.

Here I just use the __DATA__ section to hold the data, but if you prefer, you can read from a file instead: open a filehandle, such as open my $fh, '<', 'test1.txt';, and change *DATA to $fh.

Output:

d i n

Code:

use warnings;
use strict;
use Text::CSV_XS;
use autodie;

my $csv = Text::CSV_XS->new( { sep_char => "\t" } );
my @list;
$csv->column_names ($csv->getline (*DATA));
while ( my $hr = $csv->getline_hr(*DATA) ) {
    push @list, $hr->{'Ball'};
}

print "@list\n";
__DATA__
Camera Make Camera Model    Text    Ball    Swing
a   b   c   d   e
f   g   h   i   j
k   l   m   n   o

ETA: If you're going to cut & paste to try it out, make sure that the tabs are carried over in the data.
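One quick way to check whether the tabs survived the copy is to count the tab-separated fields on each line. This is only a rough sketch; field_counts is a hypothetical helper name of mine, and the two sample strings are made up to mirror the data above, once with real tabs and once with the tabs replaced by spaces.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the number of tab-separated fields on each
# non-blank line of the given text.
sub field_counts {
    my ($text) = @_;
    my @counts;
    for my $line ( split /\n/, $text ) {
        next unless $line =~ /\S/;              # skip blank lines
        my @fields = split /\t/, $line, -1;     # -1 keeps trailing empty fields
        push @counts, scalar @fields;
    }
    return @counts;
}

# Same two lines, once with real tabs and once with spaces only.
my $with_tabs   = "Camera Make\tCamera Model\tText\tBall\tSwing\na\tb\tc\td\te\n";
my $with_spaces = "Camera Make Camera Model    Text    Ball    Swing\na   b   c   d   e\n";

print join( ',', field_counts($with_tabs) ),   "\n";    # prints "5,5"
print join( ',', field_counts($with_spaces) ), "\n";    # prints "1,1"
```

If every line reports a single field, the separators in your copy are spaces and neither split /\t/ nor a tab sep_char will find any columns.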

Upvotes: 0

Conspicuous Compiler

Reputation: 6469

Try this instead:

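#!/usr/bin/perl
use strict;
use warnings;

# Three-argument open with a lexical filehandle; $! reports why it failed.
open( my $fh, '<', 'test1.txt' ) or die "Unable to open file: $!";
my $search_string = "Ball";

# The first line holds the column titles.
my $header = <$fh>;
chomp $header;
my @header_titles = split /\t/, $header;
my $extract_col = 0;

# Count columns until a title matches the search string.
for my $header_title (@header_titles) {
  last if $header_title =~ m/\Q$search_string\E/;
  $extract_col++;
}
die "No column matching '$search_string'\n" if $extract_col == @header_titles;

print "Extracting column $extract_col\n";

while ( my $row = <$fh> ) {
  last unless $row =~ /\S/;
  chomp $row;
  my @cells = split /\t/, $row;
  print "$cells[$extract_col] ";
}
print "\n";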

Upvotes: 0

DVK

Reputation: 129549

The reason you don't get any errors is that your program does exactly what you told it to: print every first-column value that contains the string "Ball". Since none of the cells in the first column contain that string, your program prints nothing.

Your problem is not with your Perl (it could use some minor stylistic improvement, specifically you're using an obsolete form of open(), but is mostly fine); it's with your algorithm.

HINT: your first task in the algorithm should be finding WHICH column (by number) is the "Ball" column.
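If the hint is not enough, the two steps could be sketched roughly as follows, using only core Perl. The sub name extract_column is a hypothetical helper of mine, and the sample data is held in a string with explicit \t separators (read through an in-memory filehandle) so the sketch does not depend on test1.txt.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the question): returns the values found under
# the column whose header equals $title, read from an already-open filehandle.
sub extract_column {
    my ( $fh, $title ) = @_;

    # Step 1: read the header line and find WHICH column has the title.
    my $header = <$fh>;
    return unless defined $header;
    chomp $header;
    my @titles = split /\t/, $header;
    my ($col) = grep { $titles[$_] eq $title } 0 .. $#titles;
    return unless defined $col;

    # Step 2: collect that field from every remaining non-blank row.
    my @values;
    while ( my $row = <$fh> ) {
        next unless $row =~ /\S/;
        chomp $row;
        my @cells = split /\t/, $row, -1;    # -1 keeps trailing empty fields
        push @values, $cells[$col];
    }
    return @values;
}

# Sample data matching the question, with explicit \t separators.
my $sample = join '',
    "Camera Make\tCamera Model\tText\tBall\tSwing\n",
    "a\tb\tc\td\te\n",
    "f\tg\th\ti\tj\n",
    "k\tl\tm\tn\to\n";

open( my $fh, '<', \$sample ) or die "Cannot open in-memory file: $!";
my @balls = extract_column( $fh, 'Ball' );
close $fh;
print "@balls\n";    # prints "d i n"
```

To use it on the real file, open test1.txt with a three-argument open and pass that handle instead of the in-memory one.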

Upvotes: 2
