Reputation: 69
I am quite new to Perl, and I am writing a Perl script. Part of my script counts the number of times each word appears in a text file. This counting repeats after specific intervals, so I need an array for each repeating sequence. I have the code to count the words, but only for one sequence.
for (@array) {
    $counts{$_}++;
    print "'$_'\t";
}
My trouble is that I need to create an array for the hash "counts".
EDIT: By "array" I mean that I should be able to store the repetition count of each word for each particular section of the text file. I just need to determine the partial count for each section in the text file. This is what my text file looks like:
(Image: I have uploaded an image to describe it in detail.)
Upvotes: 0
Views: 6015
Reputation: 107080
The great thing about Perl is that there's no need to initialize a hash or an array; you simply create one.
You say you're a new Perl user, but you seem to know about references. You can read an excellent tutorial right inside the Perl documentation by using the perldoc command from your command line.
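As a minimal sketch of what "simply create one" means in practice (Perl calls this autovivification), the snippet below stores into structures whose keys and elements were never set up in advance. The variable names and sample words are made up for illustration.

```perl
use strict;
use warnings;

my %counts;      # Declared, but no keys are set up in advance
my @sections;    # Likewise empty

# Incrementing a missing key creates it, starting from 0.
$counts{$_}++ for qw(apple pear apple);

# Using an undefined element as a hash reference creates the hash on the fly.
$sections[0]{apple}++;
$sections[1]{pear} += 2;

print "apple: $counts{apple}, pear: $counts{pear}\n";
print "sections stored: ", scalar(@sections), "\n";
```

The same mechanism is what lets the section-counting code in this thread write to a nested element without building the intermediate hash or array first.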
That said, and looking at your application, I can see several different types of data structures:
With an array of hashes (one hash of counts per section), the code would look something like this:
my $section_number = -1;    # Incremented to 0 at the first section header
my @data;                   # Array of sections; each element is a hash ref of counts
while (my $line = <$my_file>) {
    chomp $line;
    if ($line =~ /^>this is my \w+ statement$/) {
        $section_number++;
        $data[$section_number] = {};      # A reference to an empty hash
    }
    elsif ($section_number >= 0) {        # Ignore lines before the first header
        $data[$section_number]->{$line}++;
    }
}
The first branch of the if statement merely increments the section count, so that each section's parameters are stored in a separate hash. This is nice if the question is: in section #x, how many times did you see parameter "y"?
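To make that lookup concrete, here is a small self-contained sketch. It reads from an in-memory list instead of a file, and it uses push with $data[-1] in place of the explicit section counter, which is a variation rather than the same code. The sample lines and the helper name count_in_section are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the lines of the text file.
my @lines = (
    '>this is my first statement',  'Parameter1', 'Parameter2', 'Parameter2',
    '>this is my second statement', 'Parameter1', 'Parameter3',
);

my @data;    # Array of hash refs, one per section
for my $line (@lines) {
    if ($line =~ /^>this is my \w+ statement$/) {
        push @data, {};          # Start a new section
    }
    elsif (@data) {
        $data[-1]{$line}++;      # Count within the current section
    }
}

# Hypothetical helper: how often did $param appear in section $n (0-based)?
sub count_in_section {
    my ($n, $param) = @_;
    return 0 unless $n >= 0 && $n < @data;
    return $data[$n]{$param} // 0;
}

print count_in_section(0, 'Parameter2'), "\n";
print count_in_section(1, 'Parameter2'), "\n";
```

Returning 0 for a parameter that never occurred in a section is a design choice of this sketch; returning undef would work just as well if the caller needs to tell "absent" from "zero".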
With a hash of arrays (one array of per-section counts per parameter), the code would look something like this:
my $section_number = -1;    # Incremented to 0 at the first section header
my %data;                   # Hash keyed by parameter; each value is an array ref of per-section counts
while (my $line = <$my_file>) {
    chomp $line;
    if ($line =~ /^>this is my \w+ statement$/) {
        $section_number++;
    }
    elsif ($section_number >= 0) {        # Ignore lines before the first header
        if (not exists $data{$line}) {
            $data{$line} = [];            # Each hash value is a ref to an array
        }
        $data{$line}->[$section_number]++;
    }
}
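A short sketch of how a hash of arrays answers the reverse question, "for parameter y, what is the count in each section?". The sample lines are invented. Sections in which a parameter never appears are left undefined in the array, so the sketch treats them as 0 when reading the counts back.

```perl
use strict;
use warnings;

# Hypothetical sample input standing in for the lines of the text file.
my @lines = (
    '>this is my first statement',  'Parameter1', 'Parameter2', 'Parameter2',
    '>this is my second statement', 'Parameter1', 'Parameter3',
);

my $section_number = -1;
my %data;    # parameter => array ref of per-section counts
for my $line (@lines) {
    if ($line =~ /^>this is my \w+ statement$/) {
        $section_number++;
    }
    elsif ($section_number >= 0) {
        $data{$line}[$section_number]++;
    }
}

my %per_section;    # parameter => array ref with the gaps filled in as 0
for my $parameter (sort keys %data) {
    my @counts = map { $data{$parameter}[$_] // 0 } 0 .. $section_number;
    $per_section{$parameter} = \@counts;
    print "$parameter: @counts\n";
}
```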
Another possibility is to use a hash of hashes, as TLP showed.
The point is that when you are talking about a structure that contains more than mere scalar data, you need to use references.
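A minimal sketch of why references are needed, using made-up arrays: without them, Perl flattens nested lists into one list of scalars, while a reference is itself a single scalar that can be stored as an element.

```perl
use strict;
use warnings;

my @first  = (1, 2);
my @second = (3, 4, 5);

# Without references the two arrays are flattened into one list of scalars.
my @flat = (@first, @second);

# With references each element is a single scalar that points at an array.
my @nested = (\@first, \@second);

print scalar(@flat), "\n";              # Number of elements in the flat list
print scalar(@nested), "\n";            # Number of array references stored
print scalar(@{ $nested[1] }), "\n";    # Size of the second inner array
print $nested[1][2], "\n";              # Third element of the second inner array
```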
How you construct your data structure is really up to you and depends on what you want to track and how you want to access that data. As this one question shows, there are at least three different ways you could structure your data. Building such a complex data structure is fairly easy, and there's really nothing to initialize.
Once you understand references, your data structure can be as complex as you dare (although I suggest you start looking into object-oriented Perl coding techniques before you really go wild with them).
By the way, none of the answers mentioned how you'd access your data besides using Data::Dumper, but a simple loop would be sufficient. This is for an array of hashes:
my $section = 0;
while ($section <= $#data) {
    my %param_hash = %{$data[$section]};
    foreach my $parameter (sort keys %param_hash) {
        print "In section $section: $parameter appears $param_hash{$parameter} times\n";
    }
    $section++;
}
Upvotes: 2
Reputation: 126742
Build an anonymous hash of word counts. At the end of each section, push the hash onto the array and start a new anonymous hash. The code below implements this. (The call to Data::Dumper is there only to demonstrate the data structure that has been built.)
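use strict;
use warnings;
my $sect;      # Hash ref of counts for the current section
my @counts;    # One hash ref per completed section
while (<DATA>) {
    if (/^(\w+)/) {
        $sect->{$1}++;
    }
    elsif ($sect) {
        push @counts, $sect;
        undef $sect;
    }
}
push @counts, $sect if $sect;    # Keep the last section if there is no trailing separator
use Data::Dumper;
$Data::Dumper::Sortkeys = 1;
print Data::Dumper->Dump([\@counts], ['*counts']);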
__DATA__
--------------------
>this is my first statement
Parameter1
Parameter2
Parameter3
Parameter2
--------------------
>this is my second statement
Parameter1
Parameter2
Parameter3
--------------------
>this is my third statement
Parameter1
Parameter2
Parameter2
Parameter3
--------------------
>this is my fourth statement
Parameter1
Parameter2
--------------------
>this is my fifth statement
Parameter1
Parameter2
Parameter3
Parameter4
--------------------
OUTPUT
@counts = (
{
'Parameter1' => 1,
'Parameter2' => 2,
'Parameter3' => 1
},
{
'Parameter1' => 1,
'Parameter2' => 1,
'Parameter3' => 1
},
{
'Parameter1' => 1,
'Parameter2' => 2,
'Parameter3' => 1
},
{
'Parameter1' => 1,
'Parameter2' => 1
},
{
'Parameter1' => 1,
'Parameter2' => 1,
'Parameter3' => 1,
'Parameter4' => 1
}
);
Upvotes: 0
Reputation: 67910
I'm not exactly sure what you are asking here, but a good way to start is perhaps to simply add all your data to a hash, and then extract what data you need from that hash.
use strict;
use warnings;
use Data::Dumper;
my %count;
my $section;
while (<DATA>) {
    chomp;
    if (/^section/) {    # some check to tell sections apart
        $section = $_;
    } else {
        $count{$section}{$_}++;
    }
}
print Dumper \%count;         # see what your structure looks like
my @array = values %count;    # if you don't like hashes
__DATA__
section1
param1
param2
param2
param3
section2
param1
param2
param3
param1
section3
param4
param1
param1
param2
section4
param1
param3
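A sketch of the "extract what data you need" step for a hash of hashes shaped like %count. The structure is written out literally here so the block runs on its own, with values matching the first two sections of the sample data above. Sorting the keys is an addition of this sketch: it gives a stable order, since Perl does not guarantee the order of hash keys.

```perl
use strict;
use warnings;

# Hash of hashes written out by hand: section name => { parameter => count }
my %count = (
    section1 => { param1 => 1, param2 => 2, param3 => 1 },
    section2 => { param1 => 2, param2 => 1, param3 => 1 },
);

my @report;
for my $section (sort keys %count) {
    for my $param (sort keys %{ $count{$section} }) {
        push @report, "$section: $param appears $count{$section}{$param} times";
    }
}
print "$_\n" for @report;

# An array of the per-section hashes, ordered by section name.
my @array = map { $count{$_} } sort keys %count;
```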
Upvotes: 1