Reputation: 787
I am trying to optimize ZRANGEBYSCORE in my Redis code in Ruby.
Specifically, the Redis website (http://redis.io/commands/zrangebyscore) states:
Time complexity: O(log(N)+M) with N being the number of elements in the sorted set and M the number of elements being returned. If M is constant (e.g. always asking for the first 10 elements with LIMIT), you can consider it O(log(N)).
So, as I read it, as long as I use LIMIT, the complexity should stay at O(log(N)) regardless of whether the limit is set to 48 or 6. However, my benchmarking seems to suggest otherwise.
require 'redis'

@@redis = Redis.new  # connection setup (omitted in the original snippet)

def bench(descr)
  start = Time.now
  yield
  puts "#{descr} #{Time.now - start} seconds"
end

def with_pipelining_48
  id = 26053643
  @@redis.pipelined {
    1000.times {
      @@redis.zrevrangebyscore("key:#{id}", "+inf", "-inf", :limit => [0, 48], :with_scores => true)
    }
  }
end

def with_pipelining_24
  id = 26053643
  @@redis.pipelined {
    1000.times {
      @@redis.zrevrangebyscore("key:#{id}", "+inf", "-inf", :limit => [0, 24], :with_scores => true)
    }
  }
end

def with_pipelining_12
  id = 26053643
  @@redis.pipelined {
    1000.times {
      @@redis.zrevrangebyscore("key:#{id}", "+inf", "-inf", :limit => [0, 12], :with_scores => true)
    }
  }
end

def with_pipelining_6
  id = 26053643
  @@redis.pipelined {
    1000.times {
      @@redis.zrevrangebyscore("key:#{id}", "+inf", "-inf", :limit => [0, 6], :with_scores => true)
    }
  }
end

bench("with pipelining_48") { with_pipelining_48 }
bench("with pipelining_24") { with_pipelining_24 }
bench("with pipelining_12") { with_pipelining_12 }
bench("with pipelining_6") { with_pipelining_6 }
Running this produces the following results:
with pipelining_48 1.709097 seconds
with pipelining_24 0.930054 seconds
with pipelining_12 0.801045 seconds
with pipelining_6 0.633037 seconds
My results appear inconsistent with the Redis documentation. Can someone shed some light on what may be causing the differences?
Upvotes: 2
Views: 480
Reputation: 39009
So, first, for a true benchmark, you should average your runs after throwing out outliers. The differences between all of these operations are in milliseconds, so a tiny network hiccup, or something strange happening on your box, will throw the numbers off. If you do that, your numbers will probably look different.
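A minimal sketch of that averaging suggestion (pure Ruby; the method name and trial count are mine, not from the question):

```ruby
# Time a block several times, drop the single fastest and slowest run
# (crude outlier trimming), and average what's left.
def trimmed_mean_seconds(trials: 7)
  times = Array.new(trials) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  end
  trimmed = times.sort[1..-2]  # discard fastest and slowest
  trimmed.sum / trimmed.size
end
```

You could then compare `trimmed_mean_seconds { with_pipelining_48 }` against `trimmed_mean_seconds { with_pipelining_6 }` instead of single runs.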
Second, when you run an operation against Redis, there's a lot more happening than just the operation cost. A connection to Redis needs to be made, with its associated start-up costs; the data needs to be transferred; and there's network back-and-forth. What you're timing is not just the cost of the operation, but all of these auxiliary costs as well. These can also be non-linear, since different internal encodings may be used depending on data size. All in all, the main point is that zrevrangebyscore is O(log(n)+m), but there are auxiliary costs involved in the operation as well, and when the operation itself is cheap (which is almost always the case with Redis), those auxiliary costs dwarf the cost of the operation itself. This is a good thing.
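One of those auxiliary costs, data transfer, grows linearly with the LIMIT count. A rough back-of-the-envelope sketch (my own illustration, assuming the RESP2 wire format, where `with_scores: true` returns each element as two bulk strings):

```ruby
# Approximate RESP2 reply size for a ZREVRANGEBYSCORE with_scores reply:
# an array header plus, per element, one bulk string for the member and
# one for the score. member_len/score_len are assumed payload lengths.
def approx_resp_bytes(member_len, score_len, m)
  pairs = m * 2  # with_scores: true => member and score per element
  header = "*#{pairs}\r\n".bytesize
  per_member = "$#{member_len}\r\n".bytesize + member_len + 2  # trailing \r\n
  per_score  = "$#{score_len}\r\n".bytesize + score_len + 2
  header + m * (per_member + per_score)
end
```

With 20-byte members and 10-byte scores, `approx_resp_bytes(20, 10, 48)` comes to 2117 bytes per reply versus 269 bytes for `approx_resp_bytes(20, 10, 6)`, so a LIMIT of 48 ships roughly 8x the data of a LIMIT of 6, per command, a thousand times over in the benchmark.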
Upvotes: 1
Reputation: 50112
I believe you misunderstood the meaning of Big O. Big O is a complexity measure and, as Redis' documentation states, by keeping M constant you can discard its "contribution" to the complexity (basically, O(constant) = O(1)). That means that when comparing such operations, as long as M is kept the same, only N plays a role.
However, this does not mean that performance is the same. Returning 6 elements or 24 differs because you're moving different volumes of data (regardless of complexity).
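To make that concrete, here is a toy cost model (my own illustration, in arbitrary work units, not taken from the Redis source): with N fixed, the O(log(N) + M) expression still grows linearly in M, even though every choice of constant M is "O(log(N))".

```ruby
# Toy cost model for ZRANGEBYSCORE-style lookups: a logarithmic skiplist
# descent plus one unit of work per returned element.
def zrangebyscore_cost_units(n, m)
  Math.log2(n) + m
end
```

For a million-element set, `zrangebyscore_cost_units(1_000_000, 48)` is about 67.9 units versus about 25.9 for M = 6, a ~2.6x spread from the linear M term alone, even though both calls are in the same complexity class. (Real timings, like the question's, add network and protocol overhead on top of this.)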
Upvotes: 0