Reputation: 13
I'm having some trouble figuring out the logic for a Ruby word count. My goal is to pass in some text and get the total count of a certain category of words, as defined in an array. So if I gave the following variables, I'd want to find out the fraction of the words that have anything to do with fruit:
content = "I went to the store today, and I bought apples, eggs, bananas,
yogurt, bacon, spices, milk, oranges, and a pineapple. I also had a fruit
smoothie and picked up some replacement Apple earbuds."
fruit = ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]
(I realize plural/singular is not consistent; just an example). Here's the code I've been trying:
content.strip
contentarray = content.downcase.split(/[^a-zA-Z]/)
contentarray.delete("")
total_wordcount = contentarray.size
IRB Test:
contentarray.grep("and")
=> ["and", "and", "and"]
contentarray.grep("and").count
=> 3
So then I try:
fruit.each do |i|
  contentarray.grep(i).count
end
=> ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]
It just returns the array, no counts. I would add them all up after if it returned any numbers. The goal is to end up with:
fruitwordcount
=> 6 / 33
or
=> .1818181
I've tried searching and found a lot of answers saying to convert the content array to a hash and count occurrences, as many tutorials do, but that gives the count of every single word when I need the counts of only a subset. I can't seem to find a good way to search an array or string of words by an array of strings. I found a few articles saying to use a histogram from the Multiset gem, but that still gives every word. Any help would be very much appreciated; please forgive my n00bery.
Upvotes: 0
Views: 141
Reputation: 10406
To get just the fruits, filter your array with contentarray.keep_if { |x| fruit.include?(x) } (note that keep_if modifies contentarray in place), then turn it into a hash of counts the way the tutorials you found do.
Or just use inject on contentarray to build the hash:
contentarray.inject(Hash.new(0)) do |result, element|
  if fruit.include?(element)
    result[element] += 1
  end
  result
end
Hash.new(0) sets the default value to 0, so we can just add one without checking whether the key already exists.
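To tie this back to the fraction the question asks for, here is a minimal sketch using the question's own data. The names counts, fruit_total and ratio are just illustrative, and Array#sum assumes Ruby 2.4 or later. Note that this only counts exact matches, so plurals like "apples" are not included.

```ruby
content = "I went to the store today, and I bought apples, eggs, bananas, " \
          "yogurt, bacon, spices, milk, oranges, and a pineapple. I also had a fruit " \
          "smoothie and picked up some replacement Apple earbuds."
fruit = ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]

# Lowercase, split on runs of non-letters, and drop any empty strings.
contentarray = content.downcase.split(/[^a-z]+/).reject(&:empty?)

# Count only the words that appear in the fruit list (exact matches).
counts = contentarray.inject(Hash.new(0)) do |result, element|
  result[element] += 1 if fruit.include?(element)
  result
end

fruit_total = counts.values.sum                 # Array#sum needs Ruby 2.4+
ratio = fruit_total.to_f / contentarray.size    # fraction of all words

puts counts.inspect
puts "#{fruit_total} / #{contentarray.size} = #{ratio}"
```

With exact matching only "pineapple", "fruit" and the lowercased "apple" are counted, so the total comes out lower than the 6 the question is aiming for.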
Upvotes: 0
Reputation: 1133
Array#each returns the array itself, as per the Ruby docs.
You probably want to give some of the other methods a try; count and map look especially promising:
fruit.map do |f|
  contentarray.count { |content| content == f }
end
Upvotes: 0
Reputation: 1681
It's because the each method just iterates and executes the block. Use map or collect to execute the block and return an array of the block's results.
result = fruit.map { |i| contentarray.grep(i).count }
Upvotes: 0
Reputation: 121000
fruit.each just iterates over the fruits, while you likely want to collect the values. map comes to the rescue:
result = fruit.map do |i|
  [i, contentarray.grep(i).count]
end
If you need a hash of fruit ⇒ count, it's simple:
result = Hash[result]
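As a small sketch of the same idea, on Ruby 2.1 or later the pairs can also be converted with to_h. The shortened word list below is a made-up stand-in for contentarray, and pairs, counts and total are illustrative names only.

```ruby
# Hypothetical, already-normalized word list standing in for contentarray.
contentarray = %w[i bought apples and a pineapple fruit smoothie apple earbuds]
fruit = %w[apple banana fruit pineapple]

# Build [fruit, count] pairs; Array#count(obj) counts exact matches.
pairs = fruit.map { |f| [f, contentarray.count(f)] }

# Array#to_h (Ruby 2.1+) is equivalent to Hash[pairs] here.
counts = pairs.to_h
total = counts.values.inject(0) { |sum, n| sum + n }

puts counts.inspect
puts total
```

Fruits that never occur stay in the hash with a count of 0, which can be handy if you want to report on the whole list.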
Hope it helps.
Upvotes: 1
Reputation: 7262
The method you are looking for is map, not each: each executes the block for each element in the array and then returns the original array, while map creates a new array containing the values returned by the block.
fruit.map do |i|
  contentarray.grep(i).count
end
=> [1, 0, 1, 0, 0, 0, 1, 0]
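The counts above only add up to 3 because grep with a string matches whole words exactly, so "apples", "bananas" and "oranges" are missed. One possible way to reach the 6 / 33 from the question is to match on prefixes instead. This is only a sketch under that assumption: fruit_words and fruitwordcount are illustrative names, and prefix matching will also count unrelated words such as "pearl" if they show up in the text.

```ruby
content = "I went to the store today, and I bought apples, eggs, bananas, " \
          "yogurt, bacon, spices, milk, oranges, and a pineapple. I also had a fruit " \
          "smoothie and picked up some replacement Apple earbuds."
fruit = ["apple", "banana", "fruit", "kiwi", "orange", "pear", "pineapple", "watermelon"]

# scan pulls out runs of letters directly, so there are no empty strings to delete.
contentarray = content.downcase.scan(/[a-z]+/)

# Keep each word that starts with any fruit name; any? ensures a word counts once.
fruit_words = contentarray.select do |word|
  fruit.any? { |f| word.start_with?(f) }
end

fruitwordcount = fruit_words.size.to_f / contentarray.size

puts fruit_words.inspect
puts "#{fruit_words.size} / #{contentarray.size} = #{fruitwordcount}"
```

If you need stricter matching than a prefix check, a singularizing step or an explicit list of plural forms would be safer.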
Upvotes: 0