Reputation: 1663
I have the following Array = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
How do I produce a count for each identical element?
Where:
"Jason" = 2, "Judah" = 3, "Allison" = 1, "Teresa" = 1, "Michelle" = 1?
or produce a hash where:
hash = { "Jason" => 2, "Judah" => 3, "Allison" => 1, "Teresa" => 1, "Michelle" => 1 }
Upvotes: 136
Views: 96791
Reputation: 28305
As of Ruby v2.7.0 (released December 2019), the core language includes Enumerable#tally, a new method designed specifically for this problem:
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
names.tally
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
The following code was not possible in standard Ruby when this question was first asked (February 2011), as it uses Object#itself, which was added in Ruby v2.2.0 (released December 2014), and Hash#transform_values, which was added in Ruby v2.4.0 (released December 2016).
These modern additions to Ruby enable the following implementation:
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
names.group_by(&:itself).transform_values(&:count)
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
For older Ruby versions, without access to the above-mentioned Hash#transform_values method, you could instead use Array#to_h, which was added in Ruby v2.1.0 (released December 2013):
names.group_by(&:itself).map { |k,v| [k, v.length] }.to_h
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
For even older Ruby versions (<= 2.1), there are several ways to solve this, but (in my opinion) there is no clear-cut "best" way. See the other answers to this post.
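If the same code has to run on several Ruby versions, one option is to check at runtime whether tally is available and fall back to an older idiom otherwise. The following is only a sketch under that assumption; the helper name count_occurrences is hypothetical and not part of Ruby.

```ruby
# Hypothetical helper (not part of Ruby): uses tally when the running
# Ruby provides it, otherwise falls back to each_with_object.
def count_occurrences(items)
  if items.respond_to?(:tally)
    # Available from Ruby 2.7 onwards
    items.tally
  else
    # Fallback for older Rubies
    items.each_with_object(Hash.new(0)) { |item, counts| counts[item] += 1 }
  end
end

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
counts = count_occurrences(names)
```

One difference to keep in mind: the fallback hash returns 0 for missing keys because of Hash.new(0), while the hash returned by tally returns nil.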
Upvotes: 201
Reputation: 12203
Ruby 2.7+
Ruby 2.7 is introducing Enumerable#tally for this exact purpose. There's a good summary here.
In this use case:
array.tally
# => { "Jason" => 2, "Judah" => 3, "Allison" => 1, "Teresa" => 1, "Michelle" => 1 }
Docs on the features being released are here.
Upvotes: 25
Reputation: 815
With ruby 2.6 you can do:
names.to_h{ |name| [name, names.count(name)] }
gives you:
{"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
Upvotes: 2
Reputation: 4038
Enumerable#each_with_object saves you from returning the final hash.
names.each_with_object(Hash.new(0)) { |name, hash| hash[name] += 1 }
Returns:
=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
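To make the "saves you from returning the final hash" point concrete, here is a small side-by-side sketch of inject and each_with_object. The variable names with_inject and with_each are made up for illustration.

```ruby
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]

# inject feeds the block's return value into the next iteration,
# so the block has to end with the hash itself.
with_inject = names.inject(Hash.new(0)) { |hash, name| hash[name] += 1; hash }

# each_with_object passes the same object every time and ignores the
# block's return value. Note the block arguments are in the opposite order.
with_each = names.each_with_object(Hash.new(0)) { |name, hash| hash[name] += 1 }
```

Both calls build the same hash; the second simply has less to get wrong.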
Upvotes: 14
Reputation: 841
Lots of great implementations here.
But as a beginner, I would consider this the easiest to read and implement:
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
name_frequency_hash = {}
names.each do |name|
  count = names.count(name)
  name_frequency_hash[name] = count
end
name_frequency_hash
#=> {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
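The steps we took: we created an empty hash, looped over the names array, counted how many times each name appears in the names array, and stored an entry with a key using the name and a value using the count.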
It may be slightly more verbose (and performance-wise you will be doing some unnecessary work, recounting and overwriting the key for every repeated name), but in my opinion it is easier to read and understand for what you want to achieve.
Upvotes: 2
Reputation: 486
Using Ruby 2.2.0 or later, you can leverage the itself method.
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
counts = {}
names.group_by(&:itself).each { |k,v| counts[k] = v.length }
# counts => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
Upvotes: 28
Reputation: 2299
a = [1, 2, 3, 2, 5, 6, 7, 5, 5]
a.each_with_object(Hash.new(0)) { |o, h| h[o] += 1 }
# => {1=>1, 2=>2, 3=>1, 5=>3, 6=>1, 7=>1}
Credit Frank Wambutt
Upvotes: 5
Reputation: 3382
arr = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
arr.uniq.inject({}) {|a, e| a.merge({e => arr.count(e)})}
Time elapsed 0.028 milliseconds
Interestingly, stupidgeek's implementation benchmarked at:
Time elapsed 0.041 milliseconds
and the winning answer:
Time elapsed 0.011 milliseconds
:)
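The timings above come from a single run on one machine, so it may be worth reproducing them locally. Below is a rough sketch using the standard library's Benchmark module. The iteration count is arbitrary, and the choice of the three approaches is an assumption, since the answer does not say exactly which code was timed.

```ruby
require 'benchmark'

names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
iterations = 1_000 # arbitrary; raise it for more stable numbers

# Three approaches taken from this thread, wrapped in lambdas so they
# can be timed in the same way.
candidates = {
  "uniq + count + merge" => -> { names.uniq.inject({}) { |a, e| a.merge(e => names.count(e)) } },
  "each + Hash.new(0)"   => -> { counts = Hash.new(0); names.each { |n| counts[n] += 1 }; counts },
  "inject + Hash.new(0)" => -> { names.inject(Hash.new(0)) { |t, e| t[e] += 1; t } }
}

# Benchmark.realtime returns the elapsed wall-clock time in seconds.
timings = candidates.map do |label, impl|
  [label, Benchmark.realtime { iterations.times { impl.call } }]
end

timings.each { |label, seconds| puts format("%-22s %.3f ms", label, seconds * 1000) }
```

With an array this small the differences are tiny and noisy; the gap between the count-based and single-pass approaches should only become meaningful on larger inputs.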
Upvotes: 1
Reputation: 118299
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
Hash[names.group_by{|i| i }.map{|k,v| [k,v.size]}]
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
Upvotes: 5
Reputation: 81671
The following is a slightly more functional programming style:
array_with_lower_case_a = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
hash_grouped_by_name = array_with_lower_case_a.group_by {|name| name}
hash_grouped_by_name.map{|name, names| [name, names.length]}
=> [["Jason", 2], ["Teresa", 1], ["Judah", 3], ["Michelle", 1], ["Allison", 1]]
One advantage of group_by is that you can use it to group equivalent but not exactly identical items:
another_array_with_lower_case_a = ["Jason", "jason", "Teresa", "Judah", "Michelle", "Judah Ben-Hur", "JUDAH", "Allison"]
hash_grouped_by_first_name = another_array_with_lower_case_a.group_by {|name| name.split(" ").first.capitalize}
hash_grouped_by_first_name.map{|first_name, names| [first_name, names.length]}
=> [["Jason", 2], ["Teresa", 1], ["Judah", 3], ["Michelle", 1], ["Allison", 1]]
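The map calls above return an array of pairs rather than the hash the question asked for. As a sketch, assuming Ruby 2.4 or later for Hash#transform_values, the same grouping can be turned into a hash directly. The variable name counts_by_first_name is made up for this example.

```ruby
names = ["Jason", "jason", "Teresa", "Judah", "Michelle", "Judah Ben-Hur", "JUDAH", "Allison"]

counts_by_first_name = names
  .group_by { |name| name.split(" ").first.capitalize } # normalise to a capitalised first name
  .transform_values(&:length)                           # replace each group with its size

# counts_by_first_name
# => {"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
```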
Upvotes: 6
Reputation: 81671
This is more a comment than an answer, but a comment wouldn't do it justice. If you do Array = foo, you crash at least one implementation of IRB:
C:\Documents and Settings\a.grimm>irb
irb(main):001:0> Array = nil
(irb):1: warning: already initialized constant Array
=> nil
C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:3177:in `rl_redisplay': undefined method `new' for nil:NilClass (NoMethodError)
from C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:3873:in `readline_internal_setup'
from C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:4704:in `readline_internal'
from C:/Ruby19/lib/ruby/site_ruby/1.9.1/rbreadline.rb:4727:in `readline'
from C:/Ruby19/lib/ruby/site_ruby/1.9.1/readline.rb:40:in `readline'
from C:/Ruby19/lib/ruby/1.9.1/irb/input-method.rb:115:in `gets'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:139:in `block (2 levels) in eval_input'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:271:in `signal_status'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:138:in `block in eval_input'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:189:in `call'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:189:in `buf_input'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:103:in `getc'
from C:/Ruby19/lib/ruby/1.9.1/irb/slex.rb:205:in `match_io'
from C:/Ruby19/lib/ruby/1.9.1/irb/slex.rb:75:in `match'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:287:in `token'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:263:in `lex'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:234:in `block (2 levels) in each_top_level_statement'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:230:in `loop'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:230:in `block in each_top_level_statement'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:229:in `catch'
from C:/Ruby19/lib/ruby/1.9.1/irb/ruby-lex.rb:229:in `each_top_level_statement'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:153:in `eval_input'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:70:in `block in start'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:69:in `catch'
from C:/Ruby19/lib/ruby/1.9.1/irb.rb:69:in `start'
from C:/Ruby19/bin/irb:12:in `<main>'
C:\Documents and Settings\a.grimm>
That's because Array is a class.
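A short sketch of the distinction: Array is a constant that refers to the class, while a lowercase name such as names (chosen here just for illustration) is an ordinary local variable that is safe to assign.

```ruby
# Array is a constant pointing at the built-in class, so it should not be reassigned.
array_is_a_class = Array.is_a?(Class)

# A lowercase name creates a plain local variable instead.
names = ["Jason", "Jason", "Teresa"]
names_is_an_array = names.is_a?(Array)
```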
Upvotes: 1
Reputation: 8757
This works.
arr = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
result = {}
arr.uniq.each{|element| result[element] = arr.count(element)}
Upvotes: 6
Reputation: 369604
There's actually a data structure which does this: Multiset.
Unfortunately, there is no Multiset implementation in the Ruby core library or standard library, but there are a couple of implementations floating around the web.
This is a great example of how the choice of a data structure can simplify an algorithm. In fact, in this particular example, the algorithm even completely goes away. It's literally just:
Multiset.new(*names)
And that's it. Example, using https://GitHub.Com/Josh/Multimap/:
require 'multiset'
names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]
histogram = Multiset.new(*names)
# => #<Multiset: {"Jason", "Jason", "Teresa", "Judah", "Judah", "Judah", "Michelle", "Allison"}>
histogram.multiplicity('Judah')
# => 3
Example, using http://maraigue.hhiro.net/multiset/index-en.php:
require 'multiset'
names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]
histogram = Multiset[*names]
# => #<Multiset:#2 'Jason', #1 'Teresa', #3 'Judah', #1 'Michelle', #1 'Allison'>
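If adding a gem is not an option, the core idea can be sketched in a few lines on top of a Hash. The class TinyMultiset below is a hypothetical stand-in written for illustration only; it does not reproduce the API of either library mentioned above, apart from borrowing the method name multiplicity.

```ruby
# Hypothetical minimal multiset for illustration; not the API of either gem.
class TinyMultiset
  def initialize(items = [])
    @counts = Hash.new(0) # element => number of occurrences
    items.each { |item| add(item) }
  end

  # Adds one occurrence of item and returns self so calls can be chained.
  def add(item)
    @counts[item] += 1
    self
  end

  # Number of times item occurs (0 if it was never added).
  def multiplicity(item)
    @counts.fetch(item, 0)
  end

  # Copy of the underlying element => count mapping.
  def to_h
    @counts.dup
  end
end

names = %w[Jason Jason Teresa Judah Michelle Judah Judah Allison]
histogram = TinyMultiset.new(names)
judah_count = histogram.multiplicity('Judah')
```

Unlike the examples above, this sketch takes the array itself rather than a splatted argument list.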
Upvotes: 17
Reputation: 5853
names.inject(Hash.new(0)) { |total, e| total[e] += 1; total }
gives you
{"Jason"=>2, "Teresa"=>1, "Judah"=>3, "Michelle"=>1, "Allison"=>1}
Upvotes: 133
Reputation: 124469
names = ["Jason", "Jason", "Teresa", "Judah", "Michelle", "Judah", "Judah", "Allison"]
counts = Hash.new(0)
names.each { |name| counts[name] += 1 }
# => {"Jason" => 2, "Teresa" => 1, ....
Upvotes: 95