Reputation: 4060
Am doing bulk data processing and for speed's sake am first storing the data in redis before dumping it into a postgresql database in 2 minute intervals. I've used a redis hash to store the data and even the hash keys in redis correspond to the columns in the database tables.
Am using redis.scan()
to get the list of redis hashes storing the rows of data and then using redis.hgetall()
to get the data in the hashes. From there I create an SQL Insert statement in SqlAlchemy and perform a bulk data insert into the database.
The issues I face is that I have to first extract the keys containing my data using SCAN:
redis_match = 'data:row_keys:*'
row_keys = rdb.scan_iter(match=redis_match, count=limit_no)
From there I fetch all data in each hash:
for index, row_id in enumerate(row_keys):
row_data = rdb.hgetall(row_id)
The row_data is in the form of key:value
but its stored in byte
form so am incurring an extra overhead to decode each key and value manually using:
for key, value in row_data.items():
key = ast.literal_eval(key.decode('unicode_escape'))
value = ast.literal_eval(value.decode('unicode_escape'))
I feel this is too much and there must be a more elegant way to:
Edit:
I've read about postgresql's Foreign Data Wrappers and especially redis_fdw and am wondering whether its the one to resolve my situation of getting a faster way to move data from Redis to Postgresql with the least trouble possible
Upvotes: 1
Views: 2311
Reputation: 7743
The redis_fdw
is the way to go. Just keep in mind that each member of your hash set will not be a different row in the corresponding Pg foreign table. It will instead create in the foreign table a single row per Redis hash and use a Pg array for all of the hash's values.
E.g. for the following hash in Redis:
myhash = {a:1, b:2}
you could create the foreign table:
CREATE FOREIGN TABLE my_pg_hash (key text, val text[])
SERVER redis_server
OPTIONS (database '0', tabletype 'hash', tablekeyprefix 'myhash');
Foreign table my_pg_hash
will contain a single row for the entire Redis hash set myhash
. This row will have as key myhash
and as value a postgres array with all of the key/value pairs of your redis hash.
SELECT * FROM my_pg_hash;
key | val
----------+-----------
myhash | {a,1,b,2}
(1 row)
You can split the val array into separate rows using Pg's unnest() function:
SELECT key, unnest(val) FROM my_pg_hash;
key | unnest
--------+--------
myhash | a
myhash | 1
myhash | b
myhash | 2
(4 rows)
Upvotes: 1