Reputation: 1082
This is my code for calculate word frequency
word_arr= ["I", "received", "this", "in", "email", "and", "found", "it", "a", "good", "read", "to", "share......", "Yes,", "Dr", "M.", "Bakri", "Musa", "seems", "to", "know", "what", "is", "happening", "in", "Malaysia.", "Some", "of", "you", "may", "know.", "He", "is", "a", "Malay", "extra horny", "horny nor", "nor their", "their babes", "babes are", "are extra", "extra SEXY..", "SEXY.. .", ". .", ". .It's", ".It's because", "because their", "their CONDOMS", "CONDOMS are", "are Made", "Made In", "In China........;)", "China........;) &&"]
arr_stop_kwd=["a","and"]
frequencies = Hash.new(0)
word_arr.each { |word|
if !arr_stop_kwd.include?(word.downcase) && !word.match('&&')
frequencies["#{word.downcase}"] += 1
end
}
when i have 100k data it will take 9.03 seconds,that,s to much time can i calculate any another way
Thx in advance
Upvotes: 0
Views: 472
Reputation: 9049
Take a look at Facets gem
You can do something like this using the frequency method
require 'facets'
frequencies = (word_arr-arr_stop_kwd).frequency
Note that stop word can be subtracted from the word_arr
. Refer to Array Documentation.
Upvotes: 2