Reputation: 31
I have a text file containing some subject results. How can I produce a count table like the one below in Perl?
Math Peter pass
English Peter pass
Music Peter fail
Science Peter fail
Art Mary fail
Music Mary fail
English Mary fail
Math Bob pass
English Bob fail
Art Bob pass
Music Bob fail
English Mike pass
Science Mike pass
name pass fail
Peter 2 2
Mary 0 3
Bob 2 2
Mike 2 0
I have already tried this and can successfully print a dump of the data in split form
#!/usr/bin/perl
use strict;
use Data::Dumper;
my $CurrentPath = "/tmp";
open(FILE, "/tmp/result.txt") or die("Cannot open file result.txt for reading: $!");
my @results = <FILE>;
s{^\s+|\s+$}{}g foreach @results;
close FILE;
my @data_split = ();
foreach my $result ( @results ) {
    push @data_split, [ split /\s+/, $result ];
}
print Dumper \@data_split;
1;
$VAR1 = [
[
'Math',
'Peter',
'pass'
],
[
'Eng',
'Peter',
'pass'
],
...............
Upvotes: 1
Views: 108
Reputation: 66964
Hashes are very useful for counting and managing a range of items, and for any structured information. Combined with arrays and references, they let you build data structures that can encode rather complex relationships. See perldsc. Once these become unwieldy the next step is to write a class.
It can go along these lines
use warnings;
use strict;
use feature 'say';
use Data::Dump qw(dd);
my $file = shift @ARGV;
die "Usage: $0 filename\n" if not $file;
open my $fh, '<', $file or die "Can't open $file: $!";
my %results;
while (<$fh>) {
    next if not /\S/;  # skip empty lines
    my ($subj, $name, $grade) = split;
    if (not $subj or not $name or not defined $grade) {
        warn "Incomplete data, line: $_";
        next;
    }
    if ($grade eq 'pass') {
        $results{$name}->{pass}++;
    }
    elsif ($grade eq 'fail') {
        $results{$name}->{fail}++;
    }
    else {
        warn "Unknown grade format for $name in $subj: $grade";
        next;
    }
}
dd \%results;
Names without a single occurrence of a particular grade end up with no key for that grade in their hash. If you need those entries then post-process %results to add them, for example
foreach my $name (keys %results) {
    $results{$name}->{pass} = 0 if not exists $results{$name}->{pass};
    $results{$name}->{fail} = 0 if not exists $results{$name}->{fail};
}
Or, add a statement to initialize both scores for each record (name) in the code.
Note that we can enlarge this data structure to store more information ($subj for instance), as the need arises, cleanly and with small code changes. This is another benefit of using hashes.
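As a minimal sketch of the last two points, the following initializes both counters the first time a name is seen and also records the subjects under each grade. The `subjects` key and the inlined sample lines are assumptions made for illustration; they are not part of the code above.

```perl
use strict;
use warnings;

# Sample lines inlined for illustration; a real script would read the file.
my @lines = (
    "Math Peter pass",
    "English Peter pass",
    "Music Peter fail",
    "Art Mary fail",
);

my %results;
for my $line (@lines) {
    my ($subj, $name, $grade) = split ' ', $line;

    # Initialize both counters the first time this name is seen
    $results{$name} //= { pass => 0, fail => 0 };

    $results{$name}{$grade}++;

    # Hypothetical extra field: remember which subjects received which grade
    push @{ $results{$name}{subjects}{$grade} }, $subj;
}

# Print one row per name; both counters always exist, so no defaults are needed
for my $name (sort keys %results) {
    my $r = $results{$name};
    printf "%-6s %d %d\n", $name, $r->{pass}, $r->{fail};
}
```

The counting code itself is unchanged; only one extra statement per new piece of information is needed.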
A few comments on the posted code
Why is there no use warnings; at the beginning? You must have that; it is directly useful.
Use lexical filehandles, open my $fh, '<', $file ... instead of globs (FILE).
Process files a line at a time unless there is a specific reason to read the whole thing first.
You always want to check input against what you expect it to be; what exactly that is depends on your problem and design. In the code above all fields must be defined and sensible (grade is allowed to be 0), while you may in fact accept a non-existent grade; adjust as suitable. Of course, if all files that this program will ever read were like what you show, always with all fields and only a pass/fail grade, then there would be no need to check input.
The pattern /\s+/ in split should almost always be replaced with ' ', which is the same but also discards leading whitespace. It is also the default, along with $_ for the string, hence just split; above (the string it splits by ' ' is $_).
You can build a data structure like the one used here from what you already have as well. However, I don't see a benefit of an array of arrays here.
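Two of these comments can be illustrated with a short sketch: the difference between the two split patterns on a line with leading whitespace, and counting from an array of arrays shaped like @data_split in the question. The sample line and records here are made up for illustration.

```perl
use strict;
use warnings;

my $line = "  Math Peter pass";

my @by_regex = split /\s+/, $line;   # leading whitespace yields an empty first field
my @by_space = split ' ',   $line;   # leading whitespace is skipped

print scalar(@by_regex), " fields with /\\s+/\n";
print scalar(@by_space), " fields with ' '\n";

# Counting from an array of arrays, as built in the question's code
my @data_split = (
    [qw(Math Peter pass)],
    [qw(Music Peter fail)],
    [qw(Art Mary fail)],
);

my %results;
for my $rec (@data_split) {
    my ($subj, $name, $grade) = @$rec;
    $results{$name}{$grade}++;
}
```

The question's code strips leading and trailing whitespace before splitting, so it does not hit the empty-field case; with split ' ' that extra step is unnecessary.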
The increment of the appropriate counter can be written more concisely
while (<$fh>) {
    next if not /\S/;
    my ($subj, $name, $grade) = split;
    # check input ...
    if ($grade !~ /^(?:pass|fail)$/) {
        warn "Unknown grade format for $name in $subj: $grade";
        next;
    }
    $results{$name}->{$grade}++;
}
If you'd rather have your code quietly accept anything in the third field and store it in %results with its count, then remove the check against pass|fail.
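If arbitrary values are accepted in the third field, the table columns are no longer known in advance. One possible approach, sketched here under that assumption, is to collect every grade label seen and default missing counts to 0 when printing. The `absent` grade and the `%seen_grade` hash are hypothetical, introduced only for this example.

```perl
use strict;
use warnings;

# Sample lines inlined for illustration; "absent" is a made-up third grade.
my @lines = (
    "Math Peter pass",
    "Music Peter fail",
    "Art Mary absent",
);

my (%results, %seen_grade);
for my $line (@lines) {
    my ($subj, $name, $grade) = split ' ', $line;
    $results{$name}{$grade}++;
    $seen_grade{$grade} = 1;          # remember every label encountered
}

my @grades = sort keys %seen_grade;   # column order: alphabetical

print join("\t", 'name', @grades), "\n";
for my $name (sort keys %results) {
    # A grade this name never received has no key, so default it to 0
    print join("\t", $name, map { $results{$name}{$_} // 0 } @grades), "\n";
}
```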
Upvotes: 2
Reputation: 126772
This is a very basic solution that does no checking of the input data. It initialises each hash value to { pass => 0, fail => 0 } so that there are no "missing" values that need to be defaulted.
Note that Perl hashes are unordered, so the order of the output is also indeterminate. If you need anything specific then you must say so.
use strict;
use warnings 'all';
use feature 'say';
open my $fh, '<', 'results.txt' or die $!;
my %grades;
while ( <$fh> ) {
    my ($class, $name, $grade) = split;
    $grades{$name} //= { pass => 0, fail => 0 };
    ++$grades{$name}{$grade};
}
say "name\tpass\tfail";
for ( keys %grades ) {
    say join "\t", $_, @{ $grades{$_} }{qw/ pass fail /};
}
name pass fail
Mary 0 3
Bob 2 2
Mike 2 0
Peter 2 2
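If a specific order is needed, one option is to record names as they first appear, or to sort the keys before printing. The following is a sketch under those assumptions, with inlined sample lines and a hypothetical `@order` array that is not part of the code above.

```perl
use strict;
use warnings;

# Sample lines inlined for illustration
my @lines = (
    "Math Peter pass",
    "Art Mary fail",
    "Math Bob pass",
    "Music Peter fail",
);

my (%grades, @order);
for my $line (@lines) {
    my (undef, $name, $grade) = split ' ', $line;
    if (not exists $grades{$name}) {
        $grades{$name} = { pass => 0, fail => 0 };
        push @order, $name;           # remember first appearance
    }
    ++$grades{$name}{$grade};
}

# Names in the order they first appear in the input
print "name\tpass\tfail\n";
print join("\t", $_, @{ $grades{$_} }{qw(pass fail)}), "\n" for @order;

# Alternatively, iterate over the names alphabetically
my @alphabetical = sort keys %grades;
```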
Upvotes: -1