Reputation: 10762
I am working with serialized array fields in one of my models, specifically counting how many members two arrays share.
By the nature of my project, I have to do a HUGE number of these overlap counts, so I was wondering if there is a super quick, clever way to do this.
At the moment, I am using the '&' method, so my code looks like this:
(user1.follower_names & user2.follower_names).count
which works fine... but I was hoping there might be a faster way to do it.
Upvotes: 1
Views: 870
Reputation: 9764
An alternative to the above is to use the '-' operator on arrays:
user1.follower_names.size - (user1.follower_names - user2.follower_names).size
Essentially, this takes the size of the first list and subtracts the number of its elements that do not appear in the second list, which leaves the size of the intersection. This isn't as fast as using sets, but it is much quicker than using intersection alone with arrays.
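As a quick sanity check of the expression above, here is a minimal sketch comparing it with the '&' version. The sample names are made up for illustration and stand in for `user1.follower_names` and `user2.follower_names`. It also shows one caveat worth keeping in mind: the two approaches only agree when the first array contains no duplicates, because `Array#&` removes duplicates while `Array#-` does not.

```ruby
# Hypothetical sample data standing in for the serialized follower_names arrays.
a = %w[alice bob carol dave]
b = %w[carol dave erin]

via_intersection = (a & b).size           # size of the de-duplicated intersection
via_difference   = a.size - (a - b).size  # total minus elements of a not found in b

puts via_intersection
puts via_difference

# With duplicates in the first array the results diverge:
# Array#& returns unique elements, Array#- leaves the duplicate count in a.size.
dup = %w[carol carol bob]
dup_intersection = (dup & b).size
dup_difference   = dup.size - (dup - b).size

puts dup_intersection
puts dup_difference
```

If the follower lists are guaranteed to be unique, both expressions return the same count.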
Upvotes: 1
Reputation: 80065
Sets are faster for this.
require 'benchmark'
require 'set'
alphabet = ('a'..'z').to_a
user1_followers = 100.times.map{ alphabet.sample(3) }
user2_followers = 100.times.map{ alphabet.sample(3) }
user1_followers_set = user1_followers.to_set
user2_followers_set = user2_followers.to_set
n = 1000
Benchmark.bm(7) do |x|
  x.report('arrays') { n.times { (user1_followers & user2_followers).size } }
  x.report('set')    { n.times { (user1_followers_set & user2_followers_set).size } }
end
Output:
              user     system      total        real
arrays    0.910000   0.000000   0.910000 (  0.926098)
set       0.350000   0.000000   0.350000 (  0.359571)
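Note that the benchmark above converts the arrays to sets once, outside the timed loop, so the gain only carries over if each set is built once and reused across many comparisons. If only the count is needed, one possible variation is to skip building the intersection set altogether and count membership hits instead. The sketch below assumes a hypothetical helper named `overlap_count` (not part of Ruby or of the code above) and made-up sample data. It has not been benchmarked here, so measure it against your own data before relying on it.

```ruby
require 'set'

# Hypothetical helper: count shared members without allocating the intersection.
# Iterates over the smaller set and probes the larger one with Set#include?.
def overlap_count(set_a, set_b)
  if set_a.size <= set_b.size
    small, large = set_a, set_b
  else
    small, large = set_b, set_a
  end
  small.count { |member| large.include?(member) }
end

# Made-up sample data; in practice these sets would be built once per user and reused.
user1_followers_set = %w[alice bob carol dave].to_set
user2_followers_set = %w[carol dave erin].to_set

shared = overlap_count(user1_followers_set, user2_followers_set)
puts shared
```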
Upvotes: 4