Reputation: 87
Unsorted data
5CM00225_10_16_2017_10_54_42.xml
5CM10538_10_16_2017_11_04_18.xml
1ZM06004_10_16_2017_11_04_14.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM05391_10_15_2017_08_51_07.xml
5ZM05388_10_17_2017_08_01_06.xml
5ZM00058_10_17_2017_08_00_49.xml
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
Expected results
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
5ZM05391_10_15_2017_08_51_07.xml
5CM00225_10_16_2017_10_54_42.xml
1ZM06004_10_16_2017_11_04_14.xml
5CM10538_10_16_2017_11_04_18.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM00058_10_17_2017_08_00_49.xml
5ZM05388_10_17_2017_08_01_06.xml
I use Net::SFTP to get a directory listing off a remote site and compare to a local file listing. I'd like to sort the list by date in the filename, but I'm running into issues due to there being other information in the string that I need to ignore.
my $sftp = Net::SFTP->new( $host, %args);
my @list = $sftp->ls($path);
open(my $fh, '>', $file); # open a log file to save remote directory listing
my @sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, $_=~/(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})/] } # unsuccessful sorting attempt
@list;
foreach my $item (@sorted) {
$i = ${item}->{filename};
print $fh "$1\n"; # prints each record to the open log file
}
close $fh;
I have done sorting before and plenty of regex but never at the same time, and I'm clearly bungling it up, because it isn't sorting anything, and not throwing any errors.
I thought about extracting the MM_DD_YYYY_hh_mm_ss out of each string and trying to use it as a reference, but I didn't make any usable headway so I scrapped the idea.
Upvotes: 2
Views: 97
Reputation: 6798
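The timestamp combined with the filename prefix (the first 8 characters) can be used as a hash key.
Then it is just a matter of sorting the hash keys and printing the values. Note that the key keeps the month_day_year order of the filename, so a plain string sort is only chronological while all files fall within the same year.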
use strict;
use warnings;
use feature 'say';
my %hash;
while(<DATA>) {
chomp;
next unless /(.+?)_(.+?)\.xml/;
$hash{"$2_$1"} = $_;
}
say $hash{$_} for sort keys %hash;
__DATA__
5CM00225_10_16_2017_10_54_42.xml
5CM10538_10_16_2017_11_04_18.xml
1ZM06004_10_16_2017_11_04_14.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM05391_10_15_2017_08_51_07.xml
5ZM05388_10_17_2017_08_01_06.xml
5ZM00058_10_17_2017_08_00_49.xml
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
Output
5CM10538_10_15_2017_08_51_06.xml
NMC00166_10_15_2017_08_51_06.xml
5ZM05391_10_15_2017_08_51_07.xml
5CM00225_10_16_2017_10_54_42.xml
1ZM06004_10_16_2017_11_04_14.xml
5CM10538_10_16_2017_11_04_18.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM00058_10_17_2017_08_00_49.xml
5ZM05388_10_17_2017_08_01_06.xml
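If the listing can span more than one year, the same hash idea still works as long as the key starts with the year. The following is only a sketch under that assumption: the regex is the one from the other answers, `%by_key` is a name chosen here, and the 2016 and 2018 filenames are made up to show the cross-year case.

```perl
use strict;
use warnings;

# The 2016 and 2018 names are invented for illustration only.
my @files = qw(
    5CM00225_10_16_2017_10_54_42.xml
    NMC00166_01_02_2018_08_51_06.xml
    5ZM05391_12_31_2016_23_59_59.xml
);

my %by_key;
for my $name (@files) {
    # prefix, then month, day, year, hour, minute, second
    my ($id, $mon, $day, $year, $hh, $mi, $ss) =
        $name =~ /^([^_]+)_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})\.xml$/
        or next;    # skip names that do not fit the pattern

    # Year first, so plain string order matches chronological order.
    # The prefix is appended to keep keys unique for equal timestamps.
    $by_key{"$year$mon$day$hh$mi${ss}_$id"} = $name;
}

my @sorted = map { $by_key{$_} } sort keys %by_key;
print "$_\n" for @sorted;
```

With a key like this, files with identical timestamps come out ordered by prefix, which may differ from their order in the input.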
Upvotes: 0
Reputation: 66881
To parse and compare dates it also makes sense to use a date-time module, Time::Piece here.
A naive version (see below for a more efficient one):
use warnings;
use strict;
use feature 'say';
use Time::Piece;
my @orig = (
'5CM00225_10_16_2017_10_54_42.xml',
'5CM10538_10_16_2017_11_04_18.xml',
'1ZM06004_10_16_2017_11_04_14.xml',
'5XM10010_10_17_2017_08_00_47.xml',
'5ZM05391_10_15_2017_08_51_07.xml',
'5ZM05388_10_17_2017_08_01_06.xml',
'5ZM00058_10_17_2017_08_00_49.xml',
'NMC00166_10_15_2017_08_51_06.xml',
'5CM10538_10_15_2017_08_51_06.xml',
);
my $dt = Time::Piece->new;
my @sorted = sort {
my $a_dt = $dt->strptime($a =~ /_(.*)\./, '%m_%d_%Y_%H_%M_%S');
my $b_dt = $dt->strptime($b =~ /_(.*)\./, '%m_%d_%Y_%H_%M_%S');
$a_dt <=> $b_dt
} @orig;
say for @sorted;
This runs a regex and strptime for every comparison. Instead, precompute them all:
my @sorted =
map { $_->[1] }
sort { $a->[0] <=> $b->[0] }
map { [ $dt->strptime(/_(.*)\./, '%m_%d_%Y_%H_%M_%S'), $_ ] }
@orig;
This extracts the date-time portion of the string and builds a date-time object from it with strptime, placing it in an arrayref together with the original string. It does this for the whole input using map.
Then that list is passed to sort, which sorts it by its first element, where the Time::Piece object's builtin comparison is used. Then the second map pulls the original strings out, for our result.
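For completeness, here is the precomputed variant as a stand-alone script, in a slightly more defensive form. This is a sketch rather than the only way to do it: it assumes names that do not carry a timestamp should simply be skipped, it uses a stricter regex than the one above, and it sorts on the `epoch` value instead of on the object itself. `$format` and `$stamp` are names chosen here.

```perl
use strict;
use warnings;
use Time::Piece;

my @orig = (
    '5CM10538_10_16_2017_11_04_18.xml',
    '5ZM05388_10_17_2017_08_01_06.xml',
    'NMC00166_10_15_2017_08_51_06.xml',
);

my $format = '%m_%d_%Y_%H_%M_%S';

my @sorted =
    map  { $_->[1] }                  # 3. unwrap the original string
    sort { $a->[0] <=> $b->[0] }      # 2. compare the precomputed epoch seconds
    map  {                            # 1. parse each name exactly once
        my ($stamp) = /_(\d\d_\d\d_\d{4}_\d\d_\d\d_\d\d)\.xml$/;
        defined $stamp
            ? [ Time::Piece->strptime($stamp, $format)->epoch, $_ ]
            : ();                     # assumption: drop names without a timestamp
    }
    @orig;

print "$_\n" for @sorted;
```

Sorting on plain integers avoids calling the overloaded comparison for every pair, which may matter for long listings.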
Upvotes: 1
Reputation: 1818
Probably not the prettiest solution but it works:
use strict;
use warnings;
use Data::Dumper;
my @list = (
'5CM00225_10_16_2017_10_54_42.xml',
'5CM10538_10_16_2017_11_04_18.xml',
'1ZM06004_10_16_2017_11_04_14.xml',
'5XM10010_10_17_2017_08_00_47.xml',
'5ZM05391_10_15_2017_08_51_07.xml',
'5ZM05388_10_17_2017_08_01_06.xml',
'5ZM00058_10_17_2017_08_00_49.xml',
'NMC00166_10_15_2017_08_51_06.xml',
'5CM10538_10_15_2017_08_51_06.xml'
);
my @sorted = sort {
my ($mm1,$dd1,$yy1,$hh1,$min1,$ss1) = ($a =~ /_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})\.xml$/);
my ($mm2,$dd2,$yy2,$hh2,$min2,$ss2) = ($b =~ /_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})\.xml$/);
my $x = $yy1.$mm1.$dd1.$hh1.$min1.$ss1;
my $y = $yy2.$mm2.$dd2.$hh2.$min2.$ss2;
$x <=> $y;
} @list;
print Dumper(\@sorted);
Upvotes: 1
Reputation: 62064
This produces your desired output. It splits each line on underscore or period into a list, then only keeps the "columns" you want, in the order you want them. It keeps the year, followed by the month, day, etc. Then it joins the list elements into a new date string, then sorts lines based on dates.
use warnings;
use strict;
my @list;
while (<DATA>) {
chomp;
push @list, $_;
}
my @sorted = map { $_->[0] }
sort { $a->[1] <=> $b->[1] }
map { [$_, join '', (split /[_.]/)[3,1,2,4,5,6] ] }
@list;
print "$_\n" for @sorted; # print the sorted filenames, one per line
__DATA__
5CM00225_10_16_2017_10_54_42.xml
5CM10538_10_16_2017_11_04_18.xml
1ZM06004_10_16_2017_11_04_14.xml
5XM10010_10_17_2017_08_00_47.xml
5ZM05391_10_15_2017_08_51_07.xml
5ZM05388_10_17_2017_08_01_06.xml
5ZM00058_10_17_2017_08_00_49.xml
NMC00166_10_15_2017_08_51_06.xml
5CM10538_10_15_2017_08_51_06.xml
I believe your code fails because the regex returns its captures in the order they appear in the filename, namely month, day, etc., so $a->[1] holds only the month and the sort never looks at the rest.
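One more thing to watch for: judging by the `->{filename}` access in the question, each element returned by `ls` is a hashref rather than a plain string, so the match has to run against `$_->{filename}`. A minimal sketch of the same approach on such entries follows. The `@list` here is a hand-built stand-in for the real listing, and skipping entries that do not match (such as `.`) is an assumption.

```perl
use strict;
use warnings;

# Stand-in for the remote listing: one hashref per directory entry.
my @list = (
    { filename => '5CM00225_10_16_2017_10_54_42.xml' },
    { filename => '.' },
    { filename => '5ZM05391_10_15_2017_08_51_07.xml' },
);

my @sorted =
    map  { $_->[0] }                  # back to the original hashref
    sort { $a->[1] <=> $b->[1] }      # numeric compare of YYYYMMDDhhmmss
    map  {
        # captures arrive as month, day, year, hour, minute, second
        my @t = $_->{filename}
            =~ /_(\d{2})_(\d{2})_(\d{4})_(\d{2})_(\d{2})_(\d{2})\.xml$/;
        # reorder to year, month, day, ...; skip entries that do not match
        @t ? [ $_, join '', @t[2, 0, 1, 3, 4, 5] ] : ();
    }
    @list;

print "$_->{filename}\n" for @sorted;
```

The sorted elements are still hashrefs, so the print loop reads the `filename` field of each one.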
Upvotes: 4