sfactor

Reputation: 13062

Perl: How to perform date calculations from a parsed file

I have a CSV file with several columns. For example:

"00000089-6d83-486d-9ddf-30bbbf722583","2011-09-17 16:25:09","INTNAME","1001","https://mobile.mint.com:443"
"000004c9-92c6-4764-b320-b1403276321e","2011-11-09 13:52:30","INTNAME","2000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"

These are sample lines from a huge file I need to parse. I need to select only those lines where the 4th column is in a given list (say 1000, 2000, ...) and the second column falls between two dates (say 2011-11-01 00:00:00 to 2011-11-15 00:00:00).

How do I do that date selection and write only the matching lines out in tab-delimited form?

In the example, only the second row would be chosen and saved in tab-delimited form in another file.

Upvotes: 1

Views: 270

Answers (4)

#!/usr/bin/env perl
use strict;
use warnings;

use Date::Parse;

# Values accepted in the 4th column, stored as hash keys for fast lookup.
my %in_list = map { $_ => 1 } qw(1000 2000 3000);

# Date range to keep (inclusive), converted to epoch seconds.
my $start_date = str2time('2011-11-01 00:00:00');
my $end_date   = str2time('2011-11-15 00:00:00');

# A line of five double-quoted fields, and the timestamp field on its own.
my $RGX_FOUR_FULL = qr{"([^"]+)","([^"]+)","([^"]+)","([^"]+)","([^"]+)"};
my $RGX_DATE_FULL = qr{"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"};
my @input_data    = <DATA>;

my @res = grep {
    my $time = extract_time($_);
    my $four = extract_four($_);
    defined $time
      and $time >= $start_date
      and $time <= $end_date
      and defined $four
      and $in_list{$four}
} @input_data;

print @res;

# Return the line's timestamp as epoch seconds, or undef if none is found.
sub extract_time {
    my ($search_str) = @_;
    my ($date) = $search_str =~ $RGX_DATE_FULL or return;
    return str2time($date);
}

# Return the 4th field of the line, or undef if the line does not match.
sub extract_four {
    my ($search_str) = @_;
    my @fields = $search_str =~ $RGX_FOUR_FULL or return;
    return $fields[3];
}

__DATA__
"00000089-6d83-486d-9ddf-30bbbf722583","2011-08-17 16:25:09","INTNAME","1001","https://mobile.mint.com:443"
"00000089-6d83-486d-9ddf-30bbbf722583","2011-09-17 16:25:09","INTNAME","1001","https://mobile.mint.com:443"
"000004c9-92c6-4764-b320-b1403276321e","2011-11-09 13:52:30","INTNAME","2000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"
"000004c9-92c6-4764-b320-b1403276321e","2011-11-10 14:52:30","INTNAME","4000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"
"000004c9-92c6-4764-b320-b1403276321e","2011-11-09 13:52:30","INTNAME","3000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"

Running this prints:

"000004c9-92c6-4764-b320-b1403276321e","2011-11-09 13:52:30","INTNAME","2000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"
"000004c9-92c6-4764-b320-b1403276321e","2011-11-09 13:52:30","INTNAME","3000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"

Upvotes: 0

Toto

Reputation: 91428

Using Parse::CSV, here is a way to do the job:

#!/usr/local/bin/perl
use Modern::Perl;
use Parse::CSV;

# Values accepted in the 4th column.
my %wanted = map { $_ => 1 } qw(1000 2000);

my $parser = Parse::CSV->new(
    file => 'text.csv',
);
while ( my $value = $parser->fetch ) {
    # Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort chronologically as strings.
    if (   $wanted{ $value->[3] }
        && $value->[1] ge '2011-11-01 00:00:00'
        && $value->[1] le '2011-11-15 00:00:00' )
    {
        say "$value->[0] --> OK";
    }
    else {
        say "$value->[0] --> KO";
    }
}

Output:

00000089-6d83-486d-9ddf-30bbbf722583 --> KO
000004c9-92c6-4764-b320-b1403276321e --> OK

You can also use the filter capability:

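# Values accepted in the 4th column.
my %wanted = map { $_ => 1 } qw(1000 2000);

my $parser = Parse::CSV->new(
    file   => 'text.csv',
    filter => sub {
        # Keep the row ($_) if it matches; returning undef skips it.
        return (   $wanted{ $_->[3] }
                && $_->[1] ge '2011-11-01 00:00:00'
                && $_->[1] le '2011-11-15 00:00:00' ) ? $_ : undef;
    },
);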

while ( my $value = $parser->fetch ) {
    # do what you want with the filtered rows
}
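
To get the tab-delimited output file the question asks for, the loop body could write each row with its fields joined by tabs. This is only a sketch, assuming the rows are array references like the ones fetch returns; the helper name tsv_line and the output file name filtered.tsv are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one row (an array reference) into a tab-delimited line.
# Tabs and line breaks inside a field are replaced with a space so they cannot
# break the column layout.
sub tsv_line {
    my ($row) = @_;
    my @fields = map { my $f = $_ // ''; $f =~ s/[\t\r\n]/ /g; $f } @{$row};
    return join( "\t", @fields ) . "\n";
}

# Example row, shaped like the array reference returned by $parser->fetch.
my $row = [
    '000004c9-92c6-4764-b320-b1403276321e',
    '2011-11-09 13:52:30',
    'INTNAME',
    '2000',
    'http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13',
];

# 'filtered.tsv' is an assumed output file name.
open my $out, '>', 'filtered.tsv' or die "Cannot open filtered.tsv: $!";
print {$out} tsv_line($row);
close $out or die "Cannot close filtered.tsv: $!";
```

Inside the while loop, the equivalent call would be print {$out} tsv_line($value); with the file opened once before the loop.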

Upvotes: 2

derobert

Reputation: 51157

First, that looks like CSV, so you should parse it with Text::CSV_XS (or Text::CSV). The "standard" module for handling dates and times in Perl is DateTime, together with a parser such as DateTime::Format::ISO8601 or similar, but Date::Parse is also a possibility.
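
As a rough illustration of that approach, here is a sketch that reads the data with Text::CSV, compares dates with Date::Parse, and writes matching rows tab-delimited. The helper name filter_csv, the list of accepted values, and the date range are assumptions taken from the question's example, and the unquoted tab-separated writer is only suitable for simple data with no tabs inside fields.

```perl
use strict;
use warnings;
use Text::CSV;                   # CPAN; uses Text::CSV_XS when it is installed
use Date::Parse qw(str2time);    # CPAN

# Assumed filter settings, taken from the example in the question.
my %wanted = map { $_ => 1 } qw(1000 2000);
my $start  = str2time('2011-11-01 00:00:00');
my $end    = str2time('2011-11-15 00:00:00');

# Hypothetical helper: copy matching rows from a CSV input handle to an
# output handle, tab-delimited. Returns the number of rows written.
sub filter_csv {
    my ( $in, $out ) = @_;
    my $reader = Text::CSV->new( { binary => 1, auto_diag => 1 } );

    # Tab as separator and no quote character: for simple data only.
    my $writer = Text::CSV->new(
        { binary => 1, sep_char => "\t", quote_char => undef, eol => "\n" } );

    my $count = 0;
    while ( my $row = $reader->getline($in) ) {
        my $time = str2time( $row->[1] // '' );
        next if !defined $time || $time < $start || $time > $end;
        next if !$wanted{ $row->[3] // '' };
        $writer->print( $out, $row );
        $count++;
    }
    return $count;
}

# Demo on the two sample lines from the question, using in-memory handles.
my $sample = <<'END_CSV';
"00000089-6d83-486d-9ddf-30bbbf722583","2011-09-17 16:25:09","INTNAME","1001","https://mobile.mint.com:443"
"000004c9-92c6-4764-b320-b1403276321e","2011-11-09 13:52:30","INTNAME","2000","http://m.intel.com/content/intel-us/en/shop/shop-landing.html?t=laptop&p=13"
END_CSV

open my $in, '<', \$sample or die "Cannot read sample: $!";
my $tsv = '';
open my $out, '>', \$tsv or die "Cannot open output buffer: $!";
my $written = filter_csv( $in, $out );
close $out or die "Cannot close output buffer: $!";
print $tsv;
```

For a real run, the two in-memory handles would be replaced by handles opened on the input and output files.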

Upvotes: 1

snoofkin

Reputation: 8895

You may want to take a look at Time::Piece. Use it like this, for instance:

use Time::Piece;

# strptime() takes the same format codes as strftime().
my $time = Time::Piece->strptime($date, "%Y-%m-%d %H:%M:%S");

(Adjust the format string to match your data; the one above fits timestamps like 2011-11-09 13:52:30.)
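
Time::Piece objects can be compared with the usual numeric operators, so a range check might look like the sketch below. The helper name in_range and the date bounds are assumptions based on the question's example; strptime dies on input it cannot parse, which the eval turns into a false result.

```perl
use strict;
use warnings;
use Time::Piece;

# Format matching timestamps such as "2011-11-09 13:52:30".
my $FORMAT = '%Y-%m-%d %H:%M:%S';

# Assumed bounds, taken from the example in the question.
my $start = Time::Piece->strptime( '2011-11-01 00:00:00', $FORMAT );
my $end   = Time::Piece->strptime( '2011-11-15 00:00:00', $FORMAT );

# Hypothetical helper: returns 1 if the timestamp lies between $start and
# $end (inclusive), 0 otherwise or if the string cannot be parsed.
sub in_range {
    my ($stamp) = @_;
    my $t = eval { Time::Piece->strptime( $stamp, $FORMAT ) };
    return 0 if !defined $t;
    return ( $t >= $start && $t <= $end ) ? 1 : 0;
}

for my $stamp ( '2011-09-17 16:25:09', '2011-11-09 13:52:30' ) {
    printf "%s %s\n", $stamp, in_range($stamp) ? 'in range' : 'out of range';
}
```

Time::Piece ships with Perl, so this needs no extra installation.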

Upvotes: 1
