Reputation: 3
I am trying to understand the logic of the following script specially in terms of storing content within the hash and time scan, also any suggestion on the improvement to make it more short.
#!perl
use strict;
use warnings;
my $A = 60; # minutes
my @mth = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my @f = localtime();
my $TODAY = sprintf "%02d/%s/%4d",$f[3],$mth[$f[4]],$f[5]+1900;
my $START_MINUTE = $f[2]*60+$f[1] - $MAX_AGE;
##
my %users;
my %conn;
while (<DATA>) {
if( /\bAT\b/ ) {
my( $conn, $uid ) = /conn=(\d+).*uid=(.*?),/;
$conn{$conn} = $uid;
}
if( /ABB/ ) {
my ($timestamp, $conn) = /\[(.*?)\] conn=(\d+)/;
my ($date,$h,$m,undef) = split ':',$timestamp,4;
next unless ($date eq $TODAY);
my $minutes = $h*60 + $m;
if ($minutes >= $START_MINUTE){
my $uid = $conn{$conn};
++$users{$uid};
}
}
}
for my $uid (keys %users) {
my $count = $users{$uid};
print "$count\n" if $count > 6;
}
_DATA_
[04/Jun/2013:13:06:13 -0600] conn=13570 op=14 msgId=13 - AT dn="conn=ad1222,o=xyz.com" method=128 version=3
[04/Jun/2013:15:06:13 -0600] conn=13570 op=14 msgId=15 - RESULT ABB
Upvotes: 0
Views: 125
Reputation: 34357
There are two places where data is put in a hash
my( $conn, $uid ) = /conn=(\d+).*uid=(.*?),/;
$conn{$conn} = $uid;
}
This is straight forward, the regexp extracts the $uid and $conn and sets a hash entry with $conn as the key and $uid as the value. In this statement
$conn{$conn}
^^^^^^ ^ this is a hash
^^^^^ this is a completely different scalar
Overall the expression $conn{$conn}
refers to a single element of the hash %conn with the scalar key $conn. There are two different variables here with basically the same name!
If you are looking for improvements, stylistically the hash should be called %uid as it's values are uids
if ($minutes >= $START_MINUTE){
my $uid = $conn{$conn};
++$users{$uid};
This is a bit more "that crazy perl" stuff, although really it is straight forward and is widely used in code. All it it does is increment the hash entry for the key $uid. If there is no entry for $user{$uid} already then the statement automatically makes it and sets the value to 1
update to discuss the "time scan"
my $A = 60; # minutes
my @mth = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my @f = localtime();
my $TODAY = sprintf "%02d/%s/%4d",$f[3],$mth[$f[4]],$f[5]+1900;
my $START_MINUTE = $f[2]*60+$f[1] - $MAX_AGE;
This makes "$TODAY" which is the date today in a format that matches dates in the files and $START_MINUTE which is the number of minutes since midnight at the time the script is run
Later in the script the time of day is extracted and the minutes since midnight are found in a similar way (hour * 60 + minutes)
To "improve" this part of the script strftime could be used instead of the @mth array and the sprintf line
The calculations for the minutes could be moved to a sub called something like sub minutes_since_midnight
Bit difficult to say about improving the use of the hashes as it's not clear what they are used for out of the context of the program segment shown
Hope that more or less answers your question!!
Upvotes: 1
Reputation: 107050
any suggestion on the improvement to make it more short
You don't want to make it shorter to improve it. You need make it easier to understand in order to improve it. You already had problems understanding the logic, making it shorter isn't going to help there.
Let's take this:
my $A = 60; # minutes
my @mth = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my @f = localtime();
my $TODAY = sprintf "%02d/%s/%4d",$f[3],$mth[$f[4]],$f[5]+1900;
my $START_MINUTE = $f[2]*60+$f[1] - $MAX_AGE;
I could go through the logic and attempt to figure it out, but more likely, I'll be making the same logical assumptions of the original author. Instead, we can improve readability by using good variable names, and by expanding the logic:
my @month_list = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my ( $sec, $min, $hour, $month, $mday, $year ) = localtime; #Whoops!
my $full_year += 1900;
my $text_month = $month_list[$month];
my $today = sprintf "%02d/%s/%4d", $mday, $text_month, $full_year;
This is longer to type, but efficiency wise, it's just as efficient. Just because you can cram a bunch of operations on a single line doesn't make it faster to execute. However, mine is much easier to read and easier to maintain which will save you many hours of work. For example, my parsing of localtime
is taken directly from the Perldoc on localtime. If you find an issue, and you think it could be due to my parsing of localtime, you could quickly compare my code to the Perldoc.
In fact, there is an error. Take a look of the localtime
documentation and compare it to what I have, and you'll see I have $month
and $mday
mixed up.
Even better would be to use Time::Piece. In fact, Time::Piece
would have made parsing the timestamp much cleaner too.
So, please understand that shorter code isn't better if it's just harder to understand, and it is usually not any more efficient in execution.
Upvotes: 0