Reputation: 3338
I have a website where users can submit text messages, dead simple data structure...
In the previous version of the website they are stored in a MySQL database which is very big, with lots of tables, and I want to simplify the database. I have heard Redis is good for simple data structures and non-relational information...
Would Redis be a good option for this kind of data, and how would it perform in terms of memory usage and read times with 100,000+ records a year?
Upvotes: 0
Views: 1845
Reputation: 462
redis is really only good for in-memory problem sets. It DOES have a page-to-disk capability - but then you're at the mercy of the OS swapper - namely your RAM will be in competition with system caches. Also, I think the keys always have to fit in RAM. So you're NOT going to want to store 1G+ log records - a MySQL ARCHIVE table is MUCH better for that.
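To get a feel for whether 100,000+ records a year fits in RAM, a back-of-envelope estimate is enough. This is only a sketch: the average message size, key size and per-key overhead below are guesses for illustration, not measured redis figures, so plug in your own numbers.

```python
def estimate_bytes(records_per_year, avg_value_bytes, avg_key_bytes,
                   overhead_bytes, years=1):
    """Rough RAM estimate: every record costs value + key + bookkeeping overhead."""
    per_record = avg_value_bytes + avg_key_bytes + overhead_bytes
    return records_per_year * years * per_record


# ASSUMED figures: 500-byte messages, 20-byte keys, 100 bytes of overhead per key.
total = estimate_bytes(100_000, avg_value_bytes=500, avg_key_bytes=20,
                       overhead_bytes=100)
print(f"{total / 1e6:.0f} MB per year")  # prints "62 MB per year"
```

Under those assumptions a year of messages is on the order of tens of megabytes, so the data set stays well inside the in-memory range for a long time.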
redis has master-slave replication, similar to mysql. So you can perform various tricks such as sorting on a slave to keep the master responsive. While I haven't used it, I'd speculate that for in-memory databases, MySQL Cluster is probably far more advanced - but with corresponding extra complexity / resource costs.
If you have large values for your key-value set, you can perform client-side compression/decompression. There isn't much the server can do to search on the values of those 'blobs' anyway.
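A minimal sketch of the client-side compression idea, using Python's standard zlib module. The `pack` / `unpack` helper names are made up for this example; you would pass the resulting bytes to whatever redis client you use as the value to store, and run `unpack` on what you read back.

```python
import zlib


def pack(text: str) -> bytes:
    """Compress a text value before sending it to the server."""
    return zlib.compress(text.encode("utf-8"))


def unpack(blob: bytes) -> str:
    """Reverse of pack(): decompress a value read back from the server."""
    return zlib.decompress(blob).decode("utf-8")


message = "the same phrase over and over. " * 100
blob = pack(message)
print(len(message.encode("utf-8")), "->", len(blob), "bytes")
```

Very short values can come out larger after compression, so it is probably only worth doing above some size threshold.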
One common way to get around the RAM limitation is to perform client-side sharding (partitioning). Namely, if you KNOW your upper bounds, and you can't throw more RAM at a single machine for some reason (say you already have 64GB of RAM), then you could 'shard' based on the primary key. If it's a sequence counter, you could take the bottom 2, 3 or 4 bits (or some hashing function + partition function), and distribute amongst 4, 8 or 16 server nodes. That scales linearly, though if you need to re-partition, that could be painful.
You COULD take advantage of the 'slots' in redis to start off with fewer machines. Say 1 machine with 16 slots. Then later, dump slots 8-15 and restore them on a different machine, and remap all the clients to point to the two machines (with the same slot numbers). And so forth up to 16-way sharding. At which point, you'd need to remap ALL your data to go to 32-way.
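The slot scheme above can be sketched in plain Python, with no redis client involved. The node addresses and helper names here are hypothetical; the point is only the two-step mapping of key -> slot -> machine, where the slot count stays fixed and only the slot-to-machine table changes when you add hardware.

```python
import zlib

NUM_SLOTS = 16  # fixed up front; going beyond 16-way means remapping ALL data


def slot_for_id(record_id: int) -> int:
    """Sequence-counter keys: the bottom 4 bits pick one of 16 slots."""
    return record_id & (NUM_SLOTS - 1)


def slot_for_key(key: str) -> int:
    """Non-numeric keys: hashing function first, then partition function."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SLOTS


# Start with every slot on one machine (hypothetical address).
slot_to_node = {slot: "redis-a:6379" for slot in range(NUM_SLOTS)}


def move_slots(slots, node):
    """Record that these slots were dumped and restored on another machine."""
    for slot in slots:
        slot_to_node[slot] = node


def node_for_id(record_id: int) -> str:
    """Which machine a client should talk to for this record."""
    return slot_to_node[slot_for_id(record_id)]


print(node_for_id(29))                    # slot 13, still on redis-a:6379
move_slots(range(8, 16), "redis-b:6379")  # split: slots 8-15 move out
print(node_for_id(29))                    # slot 13, now on redis-b:6379
print(node_for_id(21))                    # slot 5, stays on redis-a:6379
```

Every client has to share the same `slot_to_node` table, which is the painful part: a split means updating that table everywhere at once.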
Obviously first evaluate the command-set of redis to see if ALL your data-storage and reporting needs can be met. There are equivalents to "select * from foo for update", but they're not obvious. Not all RDBMS queries can be reproduced efficiently with key-value stores. But for simple natural-primary-key record-structures it should do fine.
Additionally, it should be easy to extend the redis command-set to perform custom operations. Just keep in mind, it's designed around no-pause single-threaded execution (which avoids locking / context-switching overhead).
But things I really like are the FIFOs, pub/sub, data-time-outs, atomic-mutations (inc/dec), lazy-sorting (e.g. on client with read-only nodes), maps of maps. It's simple enough that instead of using name-spaces, you just launch separate redis processes on different ports / UNIX-sockets (my preference if possible).
It's meant to replace memcached more than anything else, but has a very nice background persistence framework.
Upvotes: 3