Reputation: 19

Count the articles in a Paragraph

I need to count the articles (a, an, the) in a paragraph using Perl. I tried the following, but it fails:

$a += scalar(split(/a./, $_));
$an += scalar(split(/\san\s/, $_));
$the += scalar(split(/the/, $_));

Upvotes: 1

Answers (3)

Borodin

Reputation: 126742

The regex that @npinti suggested will work for you, but you need to use a global pattern match in list context and convert that to a scalar.

Like this:

use strict;
use warnings;

my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';

my @matches = $s =~ /\b(a|an|the)\b/g;
print scalar @matches, "\n";

Output:

5
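If you want a separate total per article, as the three counters in the question suggest, here is a minimal sketch along the same lines. The helper name count_articles and the /i flag (so that "The" and "An" at the start of a sentence are counted too) are my own assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): tally each article separately.
sub count_articles {
    my ($text) = @_;
    my %count = (a => 0, an => 0, the => 0);

    # With /g in scalar context, each loop iteration advances to the next match.
    # /i is an assumption: it makes "The", "A" and "An" count as well.
    while ($text =~ /\b(a|an|the)\b/gi) {
        $count{ lc $1 }++;
    }
    return \%count;
}

my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';
my $count = count_articles($s);
print "$_: $count->{$_}\n" for qw(a an the);
```

For the sample sentence this prints a: 2, an: 1, the: 2, which adds up to the same 5 matches.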

Upvotes: 2

vks

Reputation: 67988

(?:^|(?<=\s))(?:a|an|the)(?=\s|$)

You can use this to count the articles.
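A rough sketch of how this pattern might be used from Perl, next to the \b version for comparison. The variable names are just for illustration. Note that this pattern requires whitespace (or the string edge) on both sides, so articles that touch punctuation, such as "(a", "an," and "the)" in the question's sample, are not counted.

```perl
use strict;
use warnings;

my $s = 'I need to count the articles (a , an, the) in a paragraph using perl.';

# Assigning the match to an empty list forces list context;
# assigning that to a scalar yields the number of matches.
my $ws_count = () = $s =~ /(?:^|(?<=\s))(?:a|an|the)(?=\s|$)/g;
my $wb_count = () = $s =~ /\b(?:a|an|the)\b/g;

print "whitespace-delimited: $ws_count\n";   # 2
print "word-boundary:        $wb_count\n";   # 5
```

Which one you want depends on whether an article next to a bracket or comma should count.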

Upvotes: 0

npinti

Reputation: 52185

Try using something like this: \b(a|an|the)\b. This can be broken down as:

\ba\b # look for the a article.
\ban\b # look for the an article.
\bthe\b # look for the the article.

The problem with your regexes is that, with the exception of the an one, they do not check that the article is a word by itself.

Your first regex matches any a followed by any character, so it also matches inside words such as paragraph, while the third matches the wherever it occurs, including inside words such as other.

The \b asserts a word boundary, so whatever you match must start and end where a word character meets a non-word character (whitespace or punctuation) or the start or end of the string.
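To see the difference the \b makes, here is a small sketch with a made-up sample string (not from the question) that compares the unanchored pattern with the anchored one.

```perl
use strict;
use warnings;

# Made-up sample: "other" and "there" both contain the letters "the".
my $s = 'the other day the cat sat there.';

# Unanchored: also matches inside "other" and "there".
my $unanchored = () = $s =~ /the/g;

# Anchored with \b: only the standalone word "the".
my $anchored = () = $s =~ /\bthe\b/g;

print "without \\b: $unanchored\n";   # 4
print "with \\b:    $anchored\n";     # 2
```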

Upvotes: 1
