Reputation: 111
I have key-values like the following example:
KEY VALUE
key1 1
key2 2
key3 3
. .
. .
keyN N
Each of my keys needs to map to a unique number, so I am mapping the keys to auto-incremented numbers, inserting them into Redis via Redis mass insertion (which works very well), and then using the GET command for internal processing of all the key-value mappings.
But I have more than 1 billion keys, so I was wondering: is there an even more efficient (mainly lower memory usage) way of using Redis for this scenario?
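For reference, here is a minimal sketch of the kind of mass-insertion generator I mean; the input file name "keys.txt" (one key per line) and the use of plain SET are just for illustration, and the output is piped into redis-cli --pipe.
import sys

def gen_redis_proto(*args):
    # Build one command in the Redis protocol:
    # *<argc>\r\n followed by $<len>\r\n<arg>\r\n per argument
    proto = "*%d\r\n" % len(args)
    for arg in args:
        arg = str(arg)
        proto += "$%d\r\n%s\r\n" % (len(arg), arg)
    return proto

with open("keys.txt") as f:
    for i, line in enumerate(f, start=1):
        # The auto-incremented number is simply the running line index
        sys.stdout.write(gen_redis_proto("SET", line.strip(), i))
Running this as python gen_proto.py | redis-cli --pipe loads everything without paying a round trip per key.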
Upvotes: 2
Views: 4990
Reputation: 11
I'm doing exactly the same thing. Here is a simple example. If you have a better one, I'd welcome a discussion :)
import redis

pool = redis.ConnectionPool(host=your_host, port=your_port)
r = redis.Redis(connection_pool=pool)

def my_incr(pipe):
    # Use the current hash length as the next auto-incremented value
    next_value = pipe.hlen('myhash')
    pipe.multi()
    # HSETNX only sets the field if it does not exist yet,
    # so an existing key keeps its original number
    pipe.hsetnx(name='myhash', key=newkey, value=next_value)

newkey = 'key1'
# transaction() WATCHes 'myhash' and retries my_incr on conflicts
r.transaction(my_incr, 'myhash')
Upvotes: 1
Reputation: 111
I would like to answer my own question.
If you have sorted key-values, the most efficient way to bulk insert and then read them is to use a B-Tree based database.
For instance, with MapDB I am able to insert them very quickly and it takes up less memory.
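MapDB is a Java library; the same B-tree idea can be sketched in Python with sqlite3 (SQLite keeps its tables and indexes in B-trees). The file and table names below are placeholders, not part of my actual setup.
import sqlite3

con = sqlite3.connect("keymap.db")
con.execute("CREATE TABLE IF NOT EXISTS keymap (k TEXT PRIMARY KEY, v INTEGER)")

# Bulk insert the pre-sorted (key, value) pairs in one transaction
pairs = (("key%d" % i, i) for i in range(1, 1000001))
with con:
    con.executemany("INSERT OR REPLACE INTO keymap VALUES (?, ?)", pairs)

# Point lookup, the equivalent of a Redis GET
print(con.execute("SELECT v FROM keymap WHERE k = ?", ("key42",)).fetchone()[0])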
Upvotes: 0
Reputation: 10112
An auto-increment key allows a unique number to be generated when a new record is inserted into a table/Redis.
The other way is to use a UUID.
But I think auto-increment is far better, for reasons such as: a UUID needs about four times more space, ordering cannot be done based on the key, etc.
Upvotes: 1
Reputation: 23031
In order to be more memory efficient, you can use HASH
to store these key-value pairs. Redis has special encoding for small HASH
. It can save you lots of memory.
In you case, you can shard your keys into many small HASH
s, each HASH
has less than hash-max-ziplist-entries
entries. See the doc for details.
B.T.W, with the INCR
command, you can use Redis to create auto-incremented numbers.
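A minimal sketch of this sharding idea in Python; the shard count, key naming, and counter key are assumptions for illustration, not prescribed values.
import redis
import zlib

# With ~1 billion keys, 10,000,000 shards keeps each HASH around 100 fields,
# below the default hash-max-ziplist-entries limit, so the compact encoding applies.
NUM_SHARDS = 10_000_000
r = redis.Redis()

def shard_name(key):
    # Deterministically map a key to one of the small HASHes
    return "shard:%d" % (zlib.crc32(key.encode()) % NUM_SHARDS)

def put(key):
    # Allocate the next unique number with INCR, then store the mapping
    value = r.incr("key:counter")
    r.hset(shard_name(key), key, value)
    return value

def get(key):
    return r.hget(shard_name(key), key)

put("key1")
print(get("key1"))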
Upvotes: 0
Reputation: 208003
You can pipeline commands into Redis to avoid the round-trip times like this:
{ for ((i=0;i<10000000;i++)) ; do printf "set key$i $i\r\n"; done ; sleep 1; } | nc localhost 6379
That takes 80 seconds to set 10,000,000 keys.
Or, if you want to avoid creating all those processes for printf, generate the data in a single awk process:
{ awk 'BEGIN{for(i=0;i<10000000;i++){printf("set key%d %d\r\n",i,i)}}'; sleep 1; } | nc localhost 6379
That now takes 17 seconds to set 10,000,000 keys.
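If you would rather stay in Python, redis-py pipelines give the same batching effect; this is a sketch, and the batch size of 10,000 is an arbitrary choice.
import redis

# Pipelines buffer commands client-side and send each batch in one round trip
r = redis.Redis(host="localhost", port=6379)
pipe = r.pipeline(transaction=False)

for i in range(10_000_000):
    pipe.set("key%d" % i, i)
    if i % 10_000 == 9_999:
        pipe.execute()   # flush the batch
pipe.execute()           # flush any remainder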
Upvotes: 1