ado

Reputation: 1471

Finding the biggest number in a file (containing both numeric and non-numeric elements) with Perl

I really need the help of a Perl hacker. This looks easy, but I have been thinking about it for an hour and haven't come up with a solution.

Assume we have a flat file or log file like the following:

    2013-05-27T19:01:23 [INFO] item_id:1, start at Reader.pm line 23
    2013-05-27T19:01:29 [INFO] item_id:2, pause at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:1, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:1, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:1, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:1, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:3, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:3, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:3, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:5, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:5, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:5, start at Reader.pm line 23
    2013-05-27T19:01:30 [INFO] item_id:5, start at Reader.pm line 23
    (...)

I have to write a method that finds the largest item_id number (5 in this case) and stores it in a variable $found. Note that we do not know in advance which number is the largest, so I cannot use grep, because I would need to supply the largest number (5 in this case) as input. The only input we have is the location of the file. What do you suggest?

Upvotes: 1

Views: 917

Answers (4)

Schwern

Reputation: 165198

List::Util has a max() function which will select the largest number.

use List::Util qw(max);

my @ids;
while(my $line = <$fh>) {    # $fh is an already opened filehandle for the log
    my($id) = $line =~ /item_id:(\d+)/;
    push @ids, $id if defined $id;    # skip lines without an item_id
}

print max(@ids);

For edification, max is a pretty straightforward function to implement.

sub max {
    my $max = shift;    # start from the first value rather than undef
    for my $num (@_) {
        $max = $num if $num > $max;
    }

    return $max;
}

If you have a tremendous number of lines you can do the max calculation in the loop to avoid having to store a list.

my $max;
while(my $line = <$fh>) {
    my($id) = $line =~ /item_id:(\d+)/;
    next unless defined $id;    # line has no item_id
    $max = $id if !defined($max) || $id > $max;
}
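The snippets above assume $fh is already open. As a minimal sketch of how the pieces might fit together for the original question (where the only input is the file location), the streaming version could be wrapped in a subroutine. The name max_item_id is hypothetical and not part of List::Util or the question.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the answer above): returns the largest
# item_id found in the file at $path, or undef if no line contains one.
sub max_item_id {
    my ($path) = @_;

    open my $fh, '<', $path or die "Cannot open '$path': $!";

    my $max;
    while (my $line = <$fh>) {
        # Skip lines that carry no item_id at all.
        next unless $line =~ /item_id:(\d+)/;
        $max = $1 if !defined($max) || $1 > $max;
    }
    close $fh;

    return $max;
}

# Usage: pass the log file location as the first command-line argument.
if (@ARGV) {
    my $found = max_item_id($ARGV[0]);
    print defined $found ? "$found\n" : "no item_id found\n";
}
```

Returning undef when nothing matches lets the caller tell "no IDs in the file" apart from a real maximum of 0.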

Upvotes: 3

Borodin

Reputation: 126742

This solution is very straightforward. It expects the name of the input file as a parameter on the command line.

use strict;
use warnings;

my $found = 0;

while (<>) {
  next unless /item_id:(\d+)/;
  $found = $1 if $found < $1;
}

print "Found: $found\n";

output

Found: 5

Update

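If all you want is the value, there is this command-line version. The double quotes suit the Windows command prompt; in a Unix shell, use single quotes so that $f and $1 are not expanded by the shell.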

perl -ne "/item_id:(\d+)/ && $f<$1 and $f=$1; END{print $f}" data.txt

Upvotes: 4

perreal

Reputation: 98058

As a Perl one-liner:

perl -lne '{$s{$1}++ if /item_id:(\d+)/} END{print ((sort { $a <=> $b } keys %s)[-1])}' input

or,

perl -nle '{$m = $1 if /item_id:(\d+)/ and $1 > $m} END{print $m}' input
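When taking the maximum by sorting, as in the first one-liner, keep in mind that Perl's sort compares values as strings unless given a numeric comparator. A small sketch of the difference, using made-up IDs that include a two-digit value (the sample log only shows single digits):

```perl
use strict;
use warnings;

# Made-up IDs for illustration; the sample log only has single-digit ones.
my @ids = (1, 2, 3, 5, 10);

my $string_max  = (sort @ids)[-1];                # compares as strings
my $numeric_max = (sort { $a <=> $b } @ids)[-1];  # compares as numbers

print "string sort:  $string_max\n";   # prints "string sort:  5"
print "numeric sort: $numeric_max\n";  # prints "numeric sort: 10"
```

With string comparison "10" sorts before "2", so the last element is 5 rather than 10.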

Because you mentioned grep, here is a way to do this using command line tools:

cut -d' ' -f3 input | sed 's/[^:]*:\([0-9]*\),/\1/' | sort -nr | head -1

or,

sed -n 's/.*item_id:\([0-9]*\),.*/\1/p' input | sort -nr | head -1

Upvotes: 2

user1149862

Reputation:

Read the log file line by line and use a regex to pick out the item IDs. Initialize the maximum with the first ID found, and replace it whenever a larger ID appears.

use strict;
use warnings;

my $location = "file.txt";
open my $log, '<', $location or die "Cannot open $location: $!";

my $first_line = 1;
my $max_id;

while (<$log>) {
    # Capture the whole number, not just its last digit
    if (/item_id:(\d+)/) {
        if ($first_line) {
            $first_line = 0;
            $max_id = $1;
        } else {
            $max_id = $1 if ($1 > $max_id);
        }
    }
}

my $found = $max_id;
print "$found\n";

close $log;

Upvotes: 3
