Reputation: 61
I have gotten to the point where I can split and count sentences with simple end of sentence punctuation like ! ? .
However, I need it to work for complex sentences such as:
"Learning Ruby is a great endeavor!!!! Well, it can be difficult at times..."
Here you can see the punctuation repeats itself.
What I have so far, that works with simple sentences:
def count_sentences
sentence_array = self.split(/[.?!]/)
return sentence_array.count
end
Thank you!
Upvotes: 2
Views: 1586
Reputation: 110675
class String
def count_sentences
scan(/[.!?]+(?=\s|\z)/).size
end
end
str = "Learning Ruby is great!!!! The course cost $2.43... How much??!"
str.count_sentences
#=> 3
(?=\s|\z)/)
is a positive lookahead, requiring the match to be immediately followed by a whitespace character or the end of the string.
Upvotes: 3
Reputation: 22225
String#count might be easiest.
"Who will treat me to a beer? I bet, alexnewby will!".count('.!?')
Compared to tadman's solution, no intermediate array needs to be constructed. However it yields incorrect results if, for instance, a run of periods or exclamation mark is found in the string:
"Now thinking .... Ah, that's it! This is what we have to do!!!".count('.!?')
=> 8
The question therefore is: Do you need absolute, exact results, or just approximate ones (which might be sufficient, if this is used for statistical analysis of, say, large printed texts)? If you need exact results, you need to define, what is a sentence, and what is not. Think about the following text - how many sentences are in it?
Louise jumped out of the ground floor window.
"Stop! Don't run away!", cried Andy. "I did not
want to eat your chocolate; you have to believe
me!" - and, after thinking for a moment, he
added: "If you come back, I'll buy you a new
one! Large one! With hazelnuts!".
BTW, even tadman's solution is not exact. It would give a count of five for the following single sentence:
The IP address of Mr. Sloopsteen's dishwasher is 192.168.101.108!
Upvotes: 1
Reputation: 211570
It's pretty easy to adapt your code to be a little more forgiving:
def count_sentences
self.split(/[.?!]+/).count
end
There's no need for the intermediate variable or return
.
Note that empty strings will also be caught up in this, so you may want to filter those out:
test = "This is junk! There's a space at the end! "
That would return 3
with your code. Here's a fix for that:
def count_sentences
self.split(/[.?!]+/).grep(/\S/).count
end
That will select only those strings that have at least one non-space character.
Upvotes: 3