G.D. Singh

Reputation: 173

How to model time series data in Cassandra when the data has a non-uniform generation rate?

I am planning to migrate data from my existing database (Postgres) to Cassandra. Here is a brief overview of the system:

I am trying to model this data using a few different approaches.

I have tried weekly buckets instead of monthly buckets, and pagination, to improve on other parameters. But there is one thing I have not been able to sort out: partition size is non-uniform because the 3 different sources generate data at different rates.

How can I keep partition size (almost) consistent in such a data model? Ideas are welcome.

Upvotes: 1

Views: 207

Answers (1)

Firdousi Farozan

Reputation: 882

This is a classic problem, and there is no easy way to make partition sizes uniform. If you can predict the rate of ingestion per user, you can probably group users into tiers, such as high, medium, and low ingestion.

The time bucket then depends on the tier: for a high-ingestion user a partition covers a day, and for a low-ingestion user a partition covers a month.
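As a rough sketch of the tiered bucketing idea, the function below derives the bucket part of a partition key from a user's tier and a timestamp. The tier names, the `time_bucket` function, and the weekly bucket for medium users are assumptions for illustration; only the day and month buckets come from the suggestion above. The partition key would then be something like (user id, bucket).

```python
from datetime import datetime

def time_bucket(tier: str, ts: datetime) -> str:
    """Return the time-bucket string for a user's ingestion tier.

    Hypothetical tiers: "high" -> one partition per day,
    "medium" -> one per ISO week (an assumption), "low" -> one per month.
    """
    if tier == "high":
        return ts.strftime("%Y-%m-%d")
    if tier == "medium":
        iso = ts.isocalendar()  # (ISO year, ISO week, ISO weekday)
        return f"{iso[0]}-W{iso[1]:02d}"
    if tier == "low":
        return ts.strftime("%Y-%m")
    raise ValueError(f"unknown tier: {tier!r}")
```

The writer and the reader must agree on a user's tier, so the tier would have to be stored somewhere (for example in a small lookup table) and should not change without rewriting or dual-reading the affected buckets.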

To speed up a month-long query for a high-ingestion user, you can run the roughly 30 daily queries in parallel and see whether that improves the query time.
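A minimal sketch of that fan-out, assuming daily buckets named by ISO date: it lists the day buckets of a month and queries them concurrently with a thread pool. `fetch_day` is a hypothetical callable standing in for whatever single-partition query your driver runs; `day_buckets` and `fetch_month` are illustrative names, not part of any library.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from typing import Any, Callable, Iterable, List

def day_buckets(year: int, month: int) -> List[str]:
    """List the ISO-date bucket names for every day of the given month."""
    first = date(year, month, 1)
    if month == 12:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, month + 1, 1)
    days = (next_first - first).days
    return [(first + timedelta(days=i)).isoformat() for i in range(days)]

def fetch_month(
    user_id: str,
    year: int,
    month: int,
    fetch_day: Callable[[str, str], Iterable[Any]],  # hypothetical per-partition query
    max_workers: int = 8,
) -> List[Any]:
    """Query each daily partition in parallel and merge rows in day order."""
    buckets = day_buckets(year, month)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the input buckets.
        per_day = pool.map(lambda b: fetch_day(user_id, b), buckets)
        return [row for rows in per_day for row in rows]
```

Whether this actually beats a single large-partition read depends on the cluster and the data, so it is worth measuring with different `max_workers` values.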

Upvotes: 1
