Reputation: 4108

archiving some redis data to disk

I have been using Redis a lot lately and am really loving it. I am mostly familiar with its persistence options (RDB and AOF), but I do have one concern. I would like to be able to selectively "archive" some of my data to disk (or cheaper storage) once it is no longer important. I don't really want to delete it because it might be valuable at some point.

All of my keys are named id_<id>_<someattribute>. So when I am done with id 4, I want to "archive" all keys that match id_4_*. I can view them quite easily from the command line, but I can't do anything with them, per se. I have quite a bit of data (very large bitmaps) associated with this data set, and frankly I can't afford the space once the id is no longer relevant or important.

If this were MySQL, I would have my different tables and could very easily dump one to a .sql file and then drop the table. The actual .sql file isn't directly useful to me, but I could reimport the data if and when I need it. Or maybe I have two MySQL databases and I want to move one table to the other. Are there Redis equivalents to these processes? Is there some way to make an RDB or AOF file that is a subset of the data?

Any help or input on this matter would be appreciated! Thanks!

Upvotes: 2

Answers (3)

Landon

Reputation: 4108

@Hoseong Hwang recently asked what I did, so I'm posting what I ended up doing.

It was really quite simple, actually. It helped that my key space is segmented by user: all of my keys have the structure user_<USERID>_<OTHERVALUES>. My archival needs were per user; some users' data no longer needed to be kept in Redis.

So I started up another instance of redis-server, either locally on another port (6380, say) or on another machine; it makes no difference. Then I wrote a short script that called KEYS user_<USERID>_* (I understand the blocking nature of KEYS, but my key space is so small that it didn't matter; you can use SCAN if that is an issue for you). For each key returned, I ran MIGRATE to move it to the new redis-server instance. MIGRATE deletes the key from the source once the transfer succeeds, so the main instance shrinks as you go. Once all the keys had been moved, I ran SAVE on the new instance to ensure that its RDB file was up to date. That RDB file now contains just the content I wanted to archive. I then terminated the temporary redis-server, and the memory was reclaimed.
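For anyone who wants a starting point, the steps above could be sketched roughly like this in Python. This is not my original script; it assumes the third-party redis-py client, and the hosts, ports, function names and key pattern are placeholders. It uses SCAN (via scan_iter) rather than KEYS.

```python
def migrate_matching(source, pattern, dest_host, dest_port, dest_db=0, timeout_ms=5000):
    """Move every key matching `pattern` from `source` to another Redis instance.

    `source` is expected to behave like a redis-py client (scan_iter, migrate).
    Returns the number of keys handed to MIGRATE.
    """
    # Collect the names first so keys are not deleted while SCAN is still running.
    keys = list(source.scan_iter(match=pattern, count=1000))
    for key in keys:
        # MIGRATE transfers the key and, by default, deletes it from the
        # source once the destination has acknowledged it.
        source.migrate(dest_host, dest_port, key, dest_db, timeout_ms)
    return len(keys)


def archive_user(user_id):
    """Hypothetical driver; localhost:6379 and localhost:6380 are assumptions."""
    import redis  # third-party redis-py package, imported lazily

    main = redis.Redis(host="localhost", port=6379)
    archive = redis.Redis(host="localhost", port=6380)
    moved = migrate_matching(main, "user_%s_*" % user_id, "localhost", 6380)
    archive.save()  # blocking SAVE so the archive instance's RDB file is current
    return moved
```

Calling archive_user(4) would move every user_4_* key to the instance on port 6380 and write its RDB file. With a large key space you would probably want to pass batches of keys to MIGRATE instead of one key per call.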

Now keep that RDB file somewhere cheap and safe. If you ever need it again, reversing the process above to get those keys back into your main redis-server should be fairly straightforward.
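As a rough sketch of the reverse direction: start a temporary redis-server that loads the archived RDB file, then MIGRATE the keys from it back to the main instance. The helper below only builds the command line. Its name, the default port and the example path are assumptions; --port, --dir, --dbfilename and --appendonly are ordinary redis-server configuration options passed as arguments.

```python
import os


def archive_server_command(rdb_path, port=6380):
    """Build the argv for a temporary redis-server that loads one RDB file."""
    directory, filename = os.path.split(os.path.abspath(rdb_path))
    return [
        "redis-server",
        "--port", str(port),
        "--dir", directory,        # working directory that holds the dump
        "--dbfilename", filename,  # name of the RDB file to load at startup
        "--appendonly", "no",      # make sure the RDB, not an AOF, is loaded
    ]


# Example with a hypothetical archive location:
# subprocess.Popen(archive_server_command("/archive/user_4.rdb"))
```

Once that instance is up, the same SCAN and MIGRATE loop can be pointed from it at the main redis-server.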

Upvotes: 3

Didier Spezia

Reputation: 73216

Instead of trying to extract data from a live Redis instance for archiving purposes, my suggestion would be to extract the data from a dump file.

Run the BGSAVE command to generate a dump, and then use redis-rdb-tools to extract the keys you are interested in. You can easily get the result as a JSON file.

See https://github.com/sripathikrishnan/redis-rdb-tools

You can keep the JSON data in flat files, or store it in a relational database or a document store if you need it to be indexed for retrieval.
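A minimal sketch of the extraction step, assuming redis-rdb-tools is installed and its rdb command is on the PATH. The function names and file paths are made up for illustration; --command json and --key (a regular expression over key names) are options described in the redis-rdb-tools README.

```python
import subprocess


def rdb_json_command(dump_path, key_regex):
    """Build the rdb invocation that prints matching keys as JSON."""
    return ["rdb", "--command", "json", "--key", key_regex, dump_path]


def export_keys(dump_path, key_regex, out_path):
    """Run rdb and write its JSON output to out_path."""
    with open(out_path, "wb") as out:
        subprocess.run(rdb_json_command(dump_path, key_regex), stdout=out, check=True)


# Example with hypothetical paths: export every id_4_* key from a dump.
# export_keys("/var/redis/6379/dump.rdb", "id_4_.*", "id_4.json")
```

Note that the filter is a regular expression, not a Redis glob, so id_4_* becomes id_4_.* here.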

Upvotes: 1

raffian

Reputation: 32056

A few suggestions for you...

I would like to be able to selectively "archive" some of my data to disk (or cheaper storage) once it is no longer important. I don't really want to delete it because it might be valuable at some point.

If such data is that valuable, use a traditional database for storage. Although Redis supports snapshotting to disk and AOF logs, you should view it as mostly volatile storage. The primary use case for Redis is reducing latency, not persistence of valuable data.

So when I am done with id 4, I want to "archive" all keys that match id_4_*

What constitutes "done"? You need to ask yourself this question. Does it mean that after 1 day the data can fall out of Redis? If so, just use a TTL and expiration to let Redis remove the object from memory. If you need it again, fall back to the database and pull the object back into Redis. The first client will take the hit of pulling from the database, but subsequent requests will be served from the cache. If "done" means something not associated with a specific duration, then you'll have to remove items from Redis manually to conserve memory.
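To illustrate the TTL approach, here is a minimal cache-aside sketch. It assumes a client object with get and setex methods shaped like redis-py's; load_from_db is a hypothetical callable standing in for your database query, and the one-day TTL is just an example.

```python
def get_with_fallback(cache, key, load_from_db, ttl_seconds=86400):
    """Return the value for `key`, reloading it from the database on a miss."""
    value = cache.get(key)
    if value is None:
        # Cache miss: the key expired or was never loaded. This caller pays
        # the cost of the database read.
        value = load_from_db(key)
        if value is not None:
            # Store it with a TTL so Redis drops it again after ttl_seconds.
            cache.setex(key, ttl_seconds, value)
    return value
```

With redis-py this could be called as get_with_fallback(redis.Redis(), "id_4_name", my_loader), where my_loader is whatever function reads the row from your database.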

If this were mysql, I would have my different tables and would very easily just dump it to a .sql file and then drop the table. The actual .sql file isn't directly useful to me, but I could reimport the data if/when I need it.

We do the same at my firm. Important data is imported into Redis from an RDBMS by a job that runs on demand. We don't drop tables; we just selectively import data from the database into Redis. Nothing wrong with that.

Is there someway to make an rdb or aof file that is a subset of the data?

I don't believe there is a way to do selective archiving; it's either all or none.

IMO, spend more time playing with redis. I highly recommend leveraging out-of-box features instead of reinventing and/or over-engineering solutions to suit your needs.

Hope that helps!...

Upvotes: 0
