Reputation: 19
I need to count the articles (a, an, the) in a paragraph using Perl. I tried the following, but it fails:
$a += scalar(split(/a./, $_));
$an += scalar(split(/\san\s/, $_));
$the += scalar(split(/the/, $_));
Upvotes: 1
Views: 87
Reputation: 126742
The regex that @npinti suggested will work for you, but you need to use a global pattern match in list context and then take the size of the resulting list.
Like this:
use strict;
use warnings;
my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';
my @matches = $s =~ /\b(a|an|the)\b/g;
print scalar @matches, "\n";
output
5
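Since the question keeps a separate counter for each article, here is a minimal sketch of one way to do that with the same regex. The `%count` hash and the `/i` flag (so that a capitalised "The" is counted too) are my own assumptions, not part of the answer above.

```perl
use strict;
use warnings;

my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';

# One counter per article; lc() folds "The" and "the" into the same key.
# %count is an illustrative name, not something from the original code.
my %count = (a => 0, an => 0, the => 0);
$count{ lc $1 }++ while $s =~ /\b(a|an|the)\b/gi;

# Assigning the match list to an empty list and then to a scalar
# yields the number of matches without a temporary array.
my $total = () = $s =~ /\b(?:a|an|the)\b/gi;

print "$_: $count{$_}\n" for sort keys %count;
print "total: $total\n";
```

For the sample sentence this prints 2 for "a", 1 for "an", 2 for "the", and a total of 5.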
Upvotes: 2
Reputation: 67988
(?:^|(?<=\s))(?:a|an|the)(?=\s|$)
You can use this to count the articles.
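Note that this pattern only counts articles delimited by whitespace (or the ends of the string), so an article next to punctuation is skipped. A small sketch comparing it with a word-boundary pattern on the sentence from the question; the variable names are just for illustration:

```perl
use strict;
use warnings;

my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';

# Whitespace-delimited: "(a", "an," and "the)" touch punctuation, so they are skipped.
my $by_space = () = $s =~ /(?:^|(?<=\s))(?:a|an|the)(?=\s|$)/g;

# Word-boundary version: punctuation counts as a boundary too.
my $by_boundary = () = $s =~ /\b(?:a|an|the)\b/g;

print "whitespace-delimited: $by_space\n";
print "word-boundary:        $by_boundary\n";
```

On this input the first count is 2 and the second is 5. Which one is right depends on whether articles adjacent to punctuation should be counted.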
Upvotes: 0
Reputation: 52185
Try using something like this: \b(a|an|the)\b
This can be broken down as:
\ba\b    # look for the article "a"
\ban\b   # look for the article "an"
\bthe\b  # look for the article "the"
The problem with your regexes is that, with the exception of the "an" one, they do not check that the article is a word on its own.
The first regex matches any "a" followed by any character, while the third matches "the" wherever it occurs, including inside longer words.
The \b is a word boundary: it ensures that whatever you match is not adjacent to another word character, so the article must stand alone, whether it sits at the start or end of the string or next to whitespace or punctuation.
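A further reason the original attempt miscounts is that split returns the pieces between matches, not the matches themselves. The sketch below illustrates both effects; the string 'the other theme' is a made-up example, and the variable names are only illustrative.

```perl
use strict;
use warnings;

my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';

# split yields the fields around each match, so two matches give three fields.
my @fields  = split /the/, $s;
my $matches = () = $s =~ /the/g;
print "fields: ", scalar @fields, ", matches: $matches\n";

# Without \b, "the" is also found inside longer words.
my $t      = 'the other theme';   # hypothetical test string
my $loose  = () = $t =~ /the/g;
my $strict = () = $t =~ /\bthe\b/g;
print "without \\b: $loose, with \\b: $strict\n";
```

Here the split produces 3 fields for 2 matches, and the unanchored pattern finds 3 occurrences in the test string where only 1 is a standalone word.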
Upvotes: 1