Reputation: 2725
With the expression below:
words = string.scan(/\b\S+\b/i)
I am trying to scan through the string with word boundaries and case insensitivity, so if I have:
string = "A ball a Ball"
then when I run this each block:
words.each { |word| result[word] += 1 }
I am anticipating something like:
{"a"=>2, "ball"=>2}
But instead what I get is:
{"A"=>1, "ball"=>1, "a"=>1, "Ball"=>1}
After this didn't work, I tried to create a new Regexp like:
Regexp.new(Regexp.escape(string), "i")
but then I do not know how to use this or move forward from here.
Upvotes: 2
Views: 71
Reputation: 230286
The regex matches words in case-insensitive mode, but it doesn't alter the matched text in any way, so the block receives each word in its original form. Try converting the strings to lowercase when counting.
string = "A ball a Ball"
words = string.scan(/\b\S+\b/i) # => ["A", "ball", "a", "Ball"]
result = Hash.new(0)
words.each { |word| result[word.downcase] += 1 }
result # => {"a"=>2, "ball"=>2}
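As a side note, here is a minimal sketch of a variation on the same idea: downcase the whole string once before scanning, instead of downcasing each word inside the block. It also checks that the /i flag makes no difference for this particular pattern, since \S and \b contain no letters whose case could be ignored. The variable names with_flag, without_flag, and counts are only for illustration.

```ruby
string = "A ball a Ball"

# The /i flag has no effect on this pattern: \S and \b have no letter case.
with_flag    = string.scan(/\b\S+\b/i)
without_flag = string.scan(/\b\S+\b/)
puts with_flag == without_flag

# Normalize the case once, then count each word in a hash defaulting to 0.
counts = string.downcase.scan(/\b\S+\b/).each_with_object(Hash.new(0)) do |word, tally|
  tally[word] += 1
end
p counts
```

Either approach should give the same counts; downcasing up front just keeps the counting block free of case handling.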
Upvotes: 4
Reputation: 26743
The regexp is fine; your problem is when you increment your counter using the hash. Hash keys are case sensitive, so you must change the case when incrementing:
words.each { |word| result[word.upcase] += 1 }
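One thing to keep in mind is that the keys end up in whatever case you normalize to. The sketch below runs both variants on the sample string from the question; the hash names upcased and downcased are only for illustration. If you want the lowercase keys shown in the question, downcase is the one to use.

```ruby
string = "A ball a Ball"
words = string.scan(/\b\S+\b/)

# Normalizing with upcase merges the counts, but the keys are uppercase.
upcased = Hash.new(0)
words.each { |word| upcased[word.upcase] += 1 }
p upcased

# Normalizing with downcase gives the lowercase keys the question expects.
downcased = Hash.new(0)
words.each { |word| downcased[word.downcase] += 1 }
p downcased
```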
Upvotes: 2