Reputation: 668
I am trying to count how many words there are in a paragraph and then count the occurrences of each word. I can do it, but is there another way to do it using only a regex?
my $string = "John is a good boy. John goes to school with his brother Johnny. When John is hungry, he eats his tiffin.";
my @list = ();
while ($string =~ /(\b\w+\b)/gi)
{
    push(@list, $1);
}
my %counts;
for (@list) {
    $counts{$_}++;
}
print scalar(@list), "\n";    # number of words; $#list is the last index, which is one less
foreach my $keys (keys %counts) {
    print "$keys = $counts{$keys}\n";
}
Output should be
21
brother = 1
a = 1
goes = 1
is = 2
good = 1
to = 1
tiffin = 1
When = 1
boy = 1
his = 2
school = 1
Johnny = 1
he = 1
eats = 1
John = 3
with = 1
hungry = 1
Upvotes: 0
Views: 45
Reputation: 8142
I can't see a way to do this purely with a regex, and if such a way did exist, the regex would be overly complicated and very hard to maintain. But you can simplify what you have by using just a hash and dropping the list:
use strict;
use warnings;
my $string = "John is a good boy. John goes to school with his brother Johnny. When John is hungry, he eats his tiffin.";
my %counts;
my $word_count = 0;
while ($string =~ /\b(\w+)\b/g)
{
    $counts{$1}++;
    $word_count++;
}
print "$word_count\n";
foreach my $keys (keys %counts)
{
    print "$keys = $counts{$keys}\n";
}
Note: I've tweaked the regex slightly. You don't need the "\b" inside the capture group, and making the match case-insensitive isn't required because the pattern doesn't match any specific letters. I've also added "use strict;" and "use warnings;", which you should always have at the top of your Perl code so that it flags any problems.
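If you want to lean on the regex a bit more, here is a minimal sketch of a more compact variant. It relies on the fact that a /g match in list context returns all matches at once; the names @words and $word are just illustrative, and the sort is only there because hash key order is not predictable in Perl.

```perl
use strict;
use warnings;

my $string = "John is a good boy. John goes to school with his brother Johnny. When John is hungry, he eats his tiffin.";

# In list context, a /g match with no capture groups returns every match.
my @words = $string =~ /\w+/g;

# Tally each word. Hash keys are case-sensitive, so "When" and "when" would be counted separately.
my %counts;
$counts{$_}++ for @words;

print scalar(@words), "\n";

# Sort by descending count, then by name, so the order is the same on every run.
for my $word (sort { $counts{$b} <=> $counts{$a} || $a cmp $b } keys %counts) {
    print "$word = $counts{$word}\n";
}
```

The counting itself still happens in the hash; the regex only does the splitting into words.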
Upvotes: 2