chidori

Reputation: 1112

Efficient way to store a 100GB dataset

I have a 100GB dataset whose rows look like this:

cookie,iplong1,iplong2..,iplongN

I am currently trying to fit this data into Redis, with one sorted set per cookie. I also need a TTL on each of those IPs. Since Redis has no per-element TTL, my plan is to use the sorted-set score to hold the epoch time each IP was added, and then run a separate script that scans the scores and removes IPs that have expired. With that said, I am noticing that it takes almost 100GB of memory to hold this 100GB dataset. Is there a more efficient way of packing this data into Redis with a smaller memory footprint?
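For reference, here is a minimal sketch of what I had in mind, in Python with redis-py (assuming redis-py >= 3.0; the key layout, TTL length, and helper names are just my own illustration, not a fixed schema):

    import time
    import redis

    r = redis.Redis()  # assumes a local Redis instance

    TTL_SECONDS = 3600  # hypothetical TTL; use whatever expiry the logs require

    def add_ips(cookie, iplongs):
        # One sorted set per cookie; the score is the epoch time the IP was added.
        now = time.time()
        r.zadd(cookie, {str(ip): now for ip in iplongs})

    def purge_expired(cookie):
        # Separate cleanup pass: drop IPs whose insertion time is older than the TTL.
        r.zremrangebyscore(cookie, "-inf", time.time() - TTL_SECONDS)

Reads that must ignore expired-but-not-yet-purged members could filter at query time with r.zrangebyscore(cookie, time.time() - TTL_SECONDS, "+inf") instead of waiting for the cleanup pass.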

I would also be happy to hear about any other tech stack that can handle this better. The dataset is updated frequently from hourly logs, and it needs to support fast, concurrent reads.

Thanks in advance.

Upvotes: 0

Views: 192

Answers (0)
