Henrik

Reputation: 169

Allow filtering function in Cassandra (which choice is correct?)

I am currently trying to model some time series data in Cassandra. For example, I have a table bigint_table, which was created with the following query:

CREATE TABLE bigint_table (name_id int, tuuid timeuuid, timestamp timestamp, value text, PRIMARY KEY ((name_id), tuuid, timestamp)) WITH CLUSTERING ORDER BY (tuuid asc, timestamp asc)

name_id is the ID of the channel the data comes from. The tuuid column was added because without it I had problems and lost some data while inserting into the DB. One table holds a lot of rows with the same ID, but they are unique by timestamp and tuuid (the values can sometimes be the same). I repeatedly execute 2 different queries to get values and timestamps:

  1. select value from bigint_table where name_id=6 and timestamp>' 2017-11-01 8:26:47.970+0000' and timestamp<'2017-11-30 8:26:52.048+0000' order by tuuid asc, timestamp asc allow filtering

  2. select timestamp from bigint_table where name_id=6 and timestamp>' 2017-11-01 8:26:47.970+0000' and timestamp<'2017-11-30 8:26:52.048+0000' order by tuuid asc, timestamp asc allow filtering

In this post the author says one needs to resist the urge to just add ALLOW FILTERING to a query, and should instead think about the data, the model, and what one is trying to do.

I thought a lot about whether or not to use ALLOW FILTERING, and I concluded that in my case I have no choice and need to use it. But the words in the post I mentioned above keep me in doubt. I would like to know your advice and what you think about my problem. Is there another way to model my tables so that these queries do not require ALLOW FILTERING? I would be very thankful for any advice.

Upvotes: 0

Views: 794

Answers (1)

Simon Fontana Oscarsson

Reputation: 2124

The reason you need ALLOW FILTERING is that your clustering columns (tuuid, timestamp) are in the wrong order. With this key, the data within a partition is stored sorted first by tuuid and then by timestamp. But you are actually selecting data by timestamp and then ordering by tuuid, so Cassandra can't use the clustering order that you have specified to locate the rows. The order in which you define the primary key columns matters.
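
As a rough sketch of what that reordering could look like (assuming the table can be recreated and the data reloaded; the column names are taken from the question, the rest is only an illustration), timestamp goes first among the clustering columns and tuuid stays as a tie-breaker so rows with equal timestamps remain unique:

```cql
-- Same columns as in the question, but timestamp is now the first
-- clustering column; tuuid only disambiguates rows with equal timestamps.
CREATE TABLE bigint_table (
    name_id   int,
    timestamp timestamp,
    tuuid     timeuuid,
    value     text,
    PRIMARY KEY ((name_id), timestamp, tuuid)
) WITH CLUSTERING ORDER BY (timestamp ASC, tuuid ASC);

-- A range on the first clustering column, with the partition key fixed,
-- is a contiguous slice of the partition, so no ALLOW FILTERING is needed.
SELECT timestamp, value
FROM bigint_table
WHERE name_id = 6
  AND timestamp > '2017-11-01 08:26:47.970+0000'
  AND timestamp < '2017-11-30 08:26:52.048+0000';
```

With this layout the rows come back sorted by timestamp and then by tuuid, which is the table's clustering order, so the ORDER BY clause can be dropped. Both columns can also be fetched in one query instead of two.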

Upvotes: 0
