Reputation: 10266
I'm storing strings on the order of 150 MB. That's well below the maximum size of a string in Redis, but I'm seeing a lot of different, conflicting opinions on the approach I should take, and no clear path.
On the one hand, I've seen that I should use a hash with small data chunks, and on the other hand, I've been told that leads to gapping, and that storing the whole string is most efficient.
I've also seen that I could either pass in the one massive string, or build it up with a bunch of string-append operations; the latter seems like it might be more efficient than the former.
I'm reading the data from elsewhere, so I'd rather not fill a local, physical file just so that I can pass one whole string. Obviously, it'd be better all around if I could chunk the input data and feed it into Redis via appends. However, if that isn't efficient with Redis, it might take forever to feed all of the data one chunk at a time. I'd try it, but I lack the experience, and it might turn out to be inefficient for any number of reasons.
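For concreteness, the append-based feeding I have in mind is roughly this (redis-py, with an arbitrary 1 MB chunk size; untested, just to show the shape of it):

```python
import redis

r = redis.Redis()

CHUNK_SIZE = 1024 * 1024  # arbitrary 1 MB chunks

def feed_appends(key, chunks):
    # Start fresh, then build the value up with APPEND calls,
    # batched through a pipeline to cut down on round trips.
    r.delete(key)
    with r.pipeline(transaction=False) as pipe:
        for chunk in chunks:
            pipe.append(key, chunk)
        pipe.execute()
```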
That being said, there's a lot of talk of "small" strings and "large" strings, but it's not clear what Redis considers an optimally "small" string: 512 KB? 1 MB? 8 MB?
Does anyone have any definitive remarks?
I'd love it if I could just hand a file-like object or generator to redis-py, but that's more language-specific than I meant this question to be, and it's most likely impossible at the protocol level anyway: it would just mean chunking the data internally, when it's probably better to impose that on the developer.
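In other words, the kind of wrapper I'm picturing is just a hypothetical helper along these lines, hiding the chunking from the caller:

```python
def feed_fileobj(r, key, fileobj, chunk_size=1024 * 1024):
    # Hypothetical helper, not something redis-py actually offers:
    # read a file-like object and APPEND it chunk by chunk.
    r.delete(key)
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        r.append(key, chunk)
```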
Upvotes: 4
Views: 3418
Reputation: 44112
One option would be to write the content chunk by chunk into a Redis list, using a pipeline (as a context manager) to ensure that you are the only one writing at that moment.

An alternative approach, also using a list, would be to invent a random list name, write the content into it chunk by chunk, and, when you are done, update the value of a well-known key in Redis so that it points to this randomly named list. Do not forget to remove the old list; you can do that from your code, or you can use expiration if that fits your use case.
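A rough sketch of the second approach with redis-py might look like this (the key names, chunk source, and cleanup policy are just placeholders):

```python
import uuid
import redis

r = redis.Redis()

def store_version(chunks, pointer_key="mydata:current"):
    # Write the chunks into a list with a random name...
    new_list = f"mydata:{uuid.uuid4().hex}"
    with r.pipeline() as pipe:
        for chunk in chunks:
            pipe.rpush(new_list, chunk)
        pipe.execute()

    # ...then repoint the well-known key at it, getting back
    # the name of the previous list (if any).
    old_list = r.getset(pointer_key, new_list)

    # Clean up the old version here, or use r.expire(old_list, ttl)
    # if you prefer to let it age out.
    if old_list:
        r.delete(old_list)
```

A reader would then GET the pointer key and LRANGE the list it names to reassemble the value.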
Upvotes: 2