Perl grep through large file to match a string

Question

I have an array (@array) containing a list of elements. I need to check whether each of these elements exists in a master file. If an element exists in the master file, the same line of the master file must also contain the string YES (in the 5th position). If it does, the element should be stored in a different array.

My script currently uses two grep shell commands to achieve this. How can I do the same filtering in Perl itself?

...
use Data::Dumper;

my @new_array;
my @array = ('RT0AC1', 'WG3RA3');

print Dumper(\@array);

foreach ( @array ){
    my $line = `grep $_ "master_file.csv" | grep -i yes`;
    next unless($line);
    push( @new_array, $_ );
}

print Dumper(@new_array);
...

where master_file.csv looks like this:

101,RT0AC1,CONNECTED,FAULTY,NO
102,RT0AC1,CONNECTED,WORKING,YES
103,RT0AC1,NOT CONNECTED,WORKING,NO
104,WG3RA3,NOT CONNECTED,DISABLED,NO
105,WG3RA3,CONNECTED,WORKING,NO

Here I get 102,RT0AC1,CONNECTED,WORKING,YES as the value of $line, and the element RT0AC1 is stored in @new_array.

How can I avoid the backticks (`) and the two greps? I am trying to do this in pure Perl. Note that master_file.csv contains millions of records.

Shawn · Accepted Answer

Since the values you're looking for are always in the same columns, it's easy to split the current line on commas, check whether the second column exists in a hash table, and check whether the fifth column is equal to "YES":

#!/usr/bin/env perl
use strict;
use warnings;
use autodie;
use Data::Dumper;

my $filename = shift // "master_file.csv"; # Default filename if not given on command line
my @array = qw/RT0AC1 WG3RA3/; # Words you're looking for
my %words = map { $_ => 1 } @array; # Store them in a hash for fast lookup
my @new_array;

# Use Text::CSV_XS for non-trivial CSV files
open my $csv, "<", $filename;
while (<$csv>) {
    chomp;
    my @F = split /,/;
    push @new_array, $F[1] if exists $words{$F[1]} && $F[4] eq "YES";
}

print Dumper(\@new_array);
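
One thing to watch with the loop above: it pushes the element once per matching line, so an element with several YES rows would appear several times in @new_array, whereas the original shell version stored each element at most once. Here is a minimal sketch of a variant that records each element once and stops reading as soon as every element has been found, which may help with millions of records. The helper name find_enabled is hypothetical, and the case-insensitive comparison is an assumption made to mirror `grep -i yes`; use a plain `eq "YES"` if the flag is always upper case.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): returns each wanted
# element at most once, in first-seen order, if some line has that element
# in column 2 and YES in column 5.
sub find_enabled {
    my ($fh, @wanted) = @_;
    my %pending = map { $_ => 1 } @wanted;   # elements not yet found
    my @found;

    while (my $line = <$fh>) {
        last unless %pending;                # everything found: stop reading
        chomp $line;
        my (undef, $name, undef, undef, $flag) = split /,/, $line;
        next unless defined $name && defined $flag;   # skip short or blank lines
        next unless exists $pending{$name};
        next unless uc($flag) eq 'YES';      # assumption: mirror `grep -i yes`
        push @found, $name;
        delete $pending{$name};              # prevents duplicate entries
    }
    return @found;
}

# Demo on the sample rows from the question, read from an in-memory filehandle.
my $sample = <<'CSV';
101,RT0AC1,CONNECTED,FAULTY,NO
102,RT0AC1,CONNECTED,WORKING,YES
103,RT0AC1,NOT CONNECTED,WORKING,NO
104,WG3RA3,NOT CONNECTED,DISABLED,NO
105,WG3RA3,CONNECTED,WORKING,NO
CSV

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my @new_array = find_enabled($fh, qw/RT0AC1 WG3RA3/);
close $fh;

print "@new_array\n";   # prints "RT0AC1"
```

For the real file, open master_file.csv instead of the in-memory string and pass that filehandle. The early exit only pays off when every element has a YES row somewhere; otherwise the whole file is still read once, which is the same cost as the loop above.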
