clarkseth

Reputation: 229

What's the best way to get the highest value per key from an array of hashes in Perl?

What's the best way to get the highest value from an array of hashes? I want the highest ID value for each file in my array (each element has a file name and an ID).

My @array contains these values:

[
    { file => "messages0.0", id => "1", },
    { file => "messages0.1", id => "2", },
    { file => "messages0.3", id => "3", },
    { file => "messages1.0", id => "1", },
    { file => "messages1.1", id => "2", },
    { file => "messages2.0", id => "1", },
    { file => "messages2.1", id => "1", }
]

If I use

my @new_array = sort { $b->{id} <=> $a->{id} } @array;

the sort doesn't work correctly when an ID is greater than 10:

messages0.0.log;1
messages1.0.log;1
messages2.0.log;1
messages2.1.log;1
messages1.0.log;10
messages1.0.log;11

Here is my array content (with fields separated by ; for a better view):

messages1.0.log;12
messages1.0.log;11
messages1.0.log;10
messages1.0.log;9
messages0.0.log;8
messages1.0.log;8
messages0.0.log;7
messages1.0.log;7
messages0.0.log;6
messages1.0.log;6
messages0.0.log;5
messages1.0.log;5
messages2.0.log;5
messages2.1.log;5
messages0.0.log;4
messages1.0.log;4
messages2.0.log;4
messages2.1.log;4
messages2.0.log;3
messages2.1.log;3
messages0.0.log;3
messages0.2.log;3
messages0.3.log;3
messages1.0.log;3
messages2.0.log;3
messages2.1.log;3
messages0.3.log;2
messages0.2.log;2
messages0.0.log;2
messages1.0.log;2
messages2.0.log;2
messages2.1.log;2
messages0.0.log;1
messages0.2.log;1
messages0.3.log;1
messages1.0.log;1
messages1.1.log;1
messages2.0.log;1
messages2.1.log;1

My desired output is

messages1.0.log;12
messages0.0.log;8
messages2.0.log;5
messages2.1.log;5
messages0.2.log;3
messages0.3.log;3
messages1.1.log;1

This is the code that writes the statistics file:

#!/usr/bin/perl

use strict;
use warnings;

my $STAT = ".logstatistics";

open( my $out, '>', $STAT ) or die "Cannot open $STAT: $!";

# Sort by ID, numerically, highest first
my @new_array = sort { $b->{id} <=> $a->{id} } @array;

# Print log statistics, one "file;id" line per entry
foreach my $entry ( @new_array ) {
    print {$out} join( ';', $entry->{file}, $entry->{id} ), "\n";
}

close( $out );

To help me with the analysis I've written the following code to load the array from a file

open( my $in, '<', $STAT ) or die "Cannot open $STAT: $!";

while ( my $line = <$in> ) {
    chomp $line;
    # Each line has the form "file;id"; skip anything else
    my ( $file, $id ) = $line =~ /\A(.+);(\d+)\z/ or next;
    push @array, { file => $file, id => $id };
}

close( $in );

I've solved my problem by adding an if statement to the code that loads the data into @array. If the file name is the same as the previous one, the entry is skipped. This way, I have only one value for each file.
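A minimal sketch of that idea, not the original code. It assumes the lines arrive sorted by ID in descending order, as the statistics file above is written. The %seen hash and the inline sample lines are illustrative assumptions; a hash is used instead of comparing only with the previous line, because the same file name is not always on adjacent lines.

```perl
use strict;
use warnings;

# Illustrative sample lines, already sorted by ID in descending order.
my @lines = (
    'messages1.0.log;12',
    'messages1.0.log;11',
    'messages0.0.log;8',
    'messages1.0.log;8',
    'messages2.0.log;5',
    'messages0.0.log;5',
);

my ( @array, %seen );

for my $line (@lines) {
    # Split "file;id" into its two fields; skip lines that do not match.
    my ( $file, $id ) = $line =~ /\A(.+);(\d+)\z/ or next;

    # The first occurrence of a file carries its highest ID, so skip repeats.
    next if $seen{$file}++;

    push @array, { file => $file, id => $id };
}

# Prints one line per file:
#   messages1.0.log;12
#   messages0.0.log;8
#   messages2.0.log;5
print "$_->{file};$_->{id}\n" for @array;
```

If the input is not sorted, this shortcut keeps the first ID seen rather than the highest, so the ordering assumption matters.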

Upvotes: 1

Views: 108

Answers (2)

Dave Cross

Reputation: 69314

This seems to do what you want.

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

# This seems to be the data structure that you are working with
my @data = ( {
  file => 'messages1.0.log', id => 12,
}, {
  file => 'messages1.0.log', id => 11,
}, {
  file => 'messages1.0.log', id => 10,
}, {
  file => 'messages1.0.log', id => 9,
}, {
  file => 'messages0.0.log', id => 8,
}, {
  file => 'messages1.0.log', id => 8,
}, {
  file => 'messages0.0.log', id => 7,
}, {
  file => 'messages1.0.log', id => 7,
}, {
  file => 'messages0.0.log', id => 6,
}, {
  file => 'messages1.0.log', id => 6,
}, {
  file => 'messages0.0.log', id => 5,
}, {
  file => 'messages1.0.log', id => 5,
}, {
  file => 'messages2.0.log', id => 5,
}, {
  file => 'messages2.1.log', id => 5,
}, {
  file => 'messages0.0.log', id => 4,
}, {
  file => 'messages1.0.log', id => 4,
}, {
  file => 'messages2.0.log', id => 4,
}, {
  file => 'messages2.1.log', id => 4,
}, {
  file => 'messages2.0.log', id => 3,
}, {
  file => 'messages2.1.log', id => 3,
}, {
  file => 'messages0.0.log', id => 3,
}, {
  file => 'messages0.2.log', id => 3,
}, {
  file => 'messages0.3.log', id => 3,
}, {
  file => 'messages1.0.log', id => 3,
}, {
  file => 'messages2.0.log', id => 3,
}, {
  file => 'messages2.1.log', id => 3,
}, {
  file => 'messages0.3.log', id => 2,
}, {
  file => 'messages0.2.log', id => 2,
}, {
  file => 'messages0.0.log', id => 2,
}, {
  file => 'messages1.0.log', id => 2,
}, {
  file => 'messages2.0.log', id => 2,
}, {
  file => 'messages2.1.log', id => 2,
}, {
  file => 'messages0.0.log', id => 1,
}, {
  file => 'messages0.2.log', id => 1,
}, {
  file => 'messages0.3.log', id => 1,
}, {
  file => 'messages1.0.log', id => 1,
}, {
  file => 'messages1.1.log', id => 1,
}, {
  file => 'messages2.0.log', id => 1,
}, {
  file => 'messages2.1.log', id => 1,
});

my %stats;

# Walk your input data, making a note of the highest
# id associated with every file.
for (@data) {
  if (($stats{$_->{file}} // 0) < $_->{id}) {
    $stats{$_->{file}} = $_->{id};
  }
}

# Walk the %stats hash in sorted order, printing
# the file and the maximum associated id.
for ( sort my_clever_sort keys %stats) {
  say join ';', $_, $stats{$_};
}

# (Slightly) clever sorting algorithm
sub my_clever_sort {
  # Extract the floating point numbers from the filenames
  my ($str_num_a) = $a =~ /(\d+\.\d+)/;
  my ($str_num_b) = $b =~ /(\d+\.\d+)/;

  # Sort by id (descending) and then filename (ascending)
  return ($stats{$b} <=> $stats{$a}) || ($str_num_a <=> $str_num_b);
}

Upvotes: 1

J-L

Reputation: 1901

Instead of

my @new_array = sort { $a->{id} cmp $b->{id} } @array;

try this

my @new_array = sort { $a->{id} <=> $b->{id} } @array;

The <=> operator compares the fields as numbers instead of strings. It treats 10 as greater than 3, and likewise as greater than 03, because leading zeros do not change the numeric value.

The cmp operator treats your values as strings, so it will sort 21 before 3 just as it would sort BA before C.
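A small sketch of the difference, using made-up sample IDs stored as strings (as they would be after being read from a file):

```perl
use strict;
use warnings;

# Hypothetical sample IDs, held as strings.
my @ids = ( '3', '21', '10', '1' );

# String comparison: compares character by character,
# so '10' and '21' sort before '3'.
my @as_strings = sort { $a cmp $b } @ids;

# Numeric comparison: compares values, so 3 < 10 < 21.
my @as_numbers = sort { $a <=> $b } @ids;

print "cmp: @as_strings\n";    # cmp: 1 10 21 3
print "<=>: @as_numbers\n";    # <=>: 1 3 10 21
```

Swapping $a and $b in either block reverses the order, which is how the descending sort in the question is written.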

Upvotes: 1
