mu_sa

Reputation: 2725

Regular expression and String

With the expression below:

words = string.scan(/\b\S+\b/i)

I am trying to scan through the string with word boundaries and case insensitivity, so if I have:

string = "A ball a Ball" 

then with this each block:

words.each { |word| result[word] += 1 }

I am anticipating something like:

{"a"=>2, "ball"=>2}

But instead what I get is:

{"A"=>1, "ball"=>1, "a"=>1, "Ball"=>1}

After this didn't work, I tried to create a new Regexp like:

Regexp.new(Regexp.escape(string), "i")

but I do not know how to use it or how to move forward from here.

Upvotes: 2

Views: 71

Answers (2)

Sergio Tulentsev

Reputation: 230286

The regex matches words in case-insensitive mode, but it doesn't alter the matched text in any way, so the block receives each word in its original form. Try converting the strings to lower case when counting.

string = "A ball a Ball" 
words = string.scan(/\b\S+\b/i) # => ["A", "ball", "a", "Ball"]

result = Hash.new(0)
words.each { |word| result[word.downcase] += 1 } 
result # => {"a"=>2, "ball"=>2}
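As a side note, the pattern /\b\S+\b/ contains no literal letters, so the i flag has nothing to act on and can be dropped. A more compact variant of the same count might look like the sketch below. It assumes Ruby 2.7 or later, where Enumerable#tally is available; the variable name counts is only illustrative.

```ruby
string = "A ball a Ball"

# Normalize each matched word to lower case first, then count occurrences.
# Enumerable#tally (Ruby 2.7+) builds the word => count hash in one step.
counts = string.scan(/\b\S+\b/).map(&:downcase).tally

counts["a"]    # => 2
counts["ball"] # => 2
```

On older Rubies, the Hash.new(0) loop above does the same job.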

Upvotes: 4

The regexp is fine; the problem is in how you increment the counter in the hash. Hash keys are case-sensitive, so you must normalize the case of each word when incrementing:

words.each { |word| result[word.upcase] += 1 } 
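To illustrate the point about keys, here is a minimal sketch showing that "A" and "a" are distinct hash keys, and that upcasing before counting merges them. The names raw and counts are only illustrative; each_with_object is used here in place of the each block purely for compactness.

```ruby
# "A" and "a" are different strings, so they are different hash keys.
raw = Hash.new(0)
raw["A"] += 1
raw["a"] += 1
raw.size # => 2

# Normalizing the case before incrementing merges them into one key.
words = "A ball a Ball".scan(/\b\S+\b/)
counts = words.each_with_object(Hash.new(0)) { |word, acc| acc[word.upcase] += 1 }

counts["A"]    # => 2
counts["BALL"] # => 2
```

Note that with upcase the keys come out as "A" and "BALL"; use downcase instead if the lower-case keys from the question are wanted.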

Upvotes: 2
