BananaNeil

Reputation: 10762

Ruby Array - quick way to count overlap

I am working with serialized array fields in one of my models, specifically in counting how many members of each array are shared.

Now, by the nature of my project, I have to do a HUGE number of these overlap counts, so I was wondering if there is a super quick, clever way to do this.

At the moment, I am using the '&' method, so my code looks like this:

(user1.follower_names & user2.follower_names).count

which works fine... but I was hoping there might be a faster way to do it.

Upvotes: 1

Views: 870

Answers (2)

Yule

Reputation: 9764

An alternative is to use the '-' operator on arrays:

user1.follower_names.size - (user1.follower_names - user2.follower_names).size

Essentially, this takes the size of the first list and subtracts the number of its members that are not in the second list, which leaves the size of the intersection. This isn't as fast as using sets, but it is much quicker than using intersection alone with arrays.
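To make the identity behind this concrete, here is a minimal sketch with made-up follower lists. The method names `overlap_by_intersection` and `overlap_by_difference` are hypothetical, introduced only for illustration. One caveat worth checking against your data: the two expressions agree only when the first array has no duplicates, because `&` returns unique elements while `-` leaves every remaining occurrence in place.

```ruby
# Count shared members via Array#& (the result contains unique elements only).
def overlap_by_intersection(a, b)
  (a & b).size
end

# Count shared members via Array#-: total size minus the members of `a`
# that do not appear in `b`.
def overlap_by_difference(a, b)
  a.size - (a - b).size
end

# Hypothetical data, not taken from the question.
user1 = %w[ann bob cat dan]
user2 = %w[bob dan eve]

puts overlap_by_intersection(user1, user2)
puts overlap_by_difference(user1, user2)

# With a duplicate in the first array the two counts diverge,
# since the difference-based count includes every matching occurrence.
with_dup = %w[bob bob cat]
puts overlap_by_intersection(with_dup, user2)
puts overlap_by_difference(with_dup, user2)
```

If follower names are guaranteed unique per user, the two counts are interchangeable; otherwise call `uniq` on the first array before using the difference form.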

Upvotes: 1

steenslag

Reputation: 80065

Sets are faster for this.

require 'benchmark'
require 'set'
alphabet = ('a'..'z').to_a
# Sample data: 100 entries per user, each a random 3-letter array
user1_followers = 100.times.map{ alphabet.sample(3) }
user2_followers = 100.times.map{ alphabet.sample(3) }
# Convert to sets once, outside the timed loop
user1_followers_set = user1_followers.to_set
user2_followers_set = user2_followers.to_set

n = 1000
Benchmark.bm(7) do |x|
  x.report('arrays'){ n.times{ (user1_followers & user2_followers).size } }
  x.report('set'){ n.times{ (user1_followers_set & user2_followers_set).size } }
end

Output:

              user     system      total        real
arrays    0.910000   0.000000   0.910000 (  0.926098)
set       0.350000   0.000000   0.350000 (  0.359571)
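Note that the benchmark above converts the arrays to sets before timing starts, so the conversion cost is not part of the measurement. If the same user's list is compared against many others, one way to apply this is to build its set once and reuse it. The sketch below assumes that usage pattern; `overlap_count` is a hypothetical helper, not part of Ruby or of the original answer, and it counts matches directly instead of building an intermediate set. Whether it beats `&` for your data sizes is something to benchmark rather than assume.

```ruby
require 'set'

# Hypothetical helper: counts the members of `names` found in `lookup_set`.
# Assumes `names` has no duplicates; a duplicated name would be counted
# once per occurrence.
def overlap_count(names, lookup_set)
  names.count { |name| lookup_set.include?(name) }
end

# Made-up data for illustration. Build the set once, then reuse it for
# every comparison against this user.
user2_set = %w[bob dan eve].to_set

puts overlap_count(%w[ann bob cat dan], user2_set)
puts overlap_count(%w[eve fay], user2_set)
```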

Upvotes: 4
