Reputation: 34513
We understand there is no fixed rule for how many items a Ruby hash can accommodate before losing constant-time access, but we're hoping someone can share some advice.
We are storing 800K keys in a Ruby hash, and assigning them all a boolean value of true. That's it.
Each lookup seems to take a few seconds.
Should Ruby hashes exhibit constant time lookup with 800K keys?
Is there a threshold or rule of thumb for when to expect performance degradation with large hashes? We would love to hear from a Ruby expert.
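For reference, here is a minimal sketch of the setup we describe (the `"key-#{i}"` naming is invented for illustration; our real keys are different):

```ruby
require "benchmark"

# Build a hash with 800K string keys, each mapped to true.
h = {}
800_000.times { |i| h["key-#{i}"] = true }

# Time a batch of lookups; each individual lookup should be O(1).
elapsed = Benchmark.realtime do
  100_000.times { |i| h["key-#{i}"] }
end
puts "100K lookups took #{elapsed} seconds"
```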
Thanks!
Upvotes: 4
Views: 1954
Reputation: 27207
As well as converting your strings to symbols using s.to_sym, provided you are not creating or destroying hash keys during a sequence of multiple lookups, you could also call GC.disable before and GC.enable after a set of lookups. This temporarily disables garbage collection, which is relatively safe if you are running, e.g., a simple loop that doesn't create or delete large numbers of objects.
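A minimal sketch of that pattern, assuming `hash` is your lookup table and `keys_to_check` is the batch of keys you want to test (both invented here for illustration):

```ruby
# Small stand-ins for the real 800K-key hash and key batch.
hash = { foo: true, bar: true }
keys_to_check = [:foo, :bar, :baz]

GC.disable   # pause garbage collection for the duration of the batch
results = keys_to_check.map { |k| hash.key?(k) }
GC.enable    # re-enable garbage collection once the batch is done
```

Keep the disabled window short: anything allocated while GC is off accumulates until you re-enable it.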
Ruby performance can degrade significantly once the count of (Ruby) objects in memory gets into the millions. Some of the performance loss is the time the garbage collector spends scanning all of those live objects.
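You can observe this directly with `GC.stat`, which reports (among other things) how many times the collector has run. A rough sketch, allocating a million short-lived strings:

```ruby
before = GC.stat(:count)
arr = Array.new(1_000_000) { |i| "obj-#{i}" }  # allocate a million objects
after = GC.stat(:count)
puts "GC ran #{after - before} times during allocation"
```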
Upvotes: 2