Aleksey Yakushev
Aleksey Yakushev

Reputation: 85

Cassandra order by timestemp desc

I just begin study cassandra.

It was a table and queries.

    CREATE TABLE finance.tickdata(
    id_symbol  int, 
    ts         timestamp,
    bid        double,
    ask        double,
    PRIMARY KEY(id_symbol,ts)
);

And query is successful,

select ts,ask,bid 
  from finance.tickdata 
 where id_symbol=3 
 order by ts desc;

Next it was decision move id_symbol in table name, new table(s) scripts.

CREATE TABLE IF NOT EXISTS mts_src.ticks_3(
    ts         timestamp PRIMARY KEY,
    bid        double,
    ask        double
);

And now query fails,

select * from mts_src.ticks_3 order by ts desc

I read from docs, that I need use and filter (WHERE) by primary key (partition key), but technically my both examples same. Why cassandra so restricted in this aspect?

And one more question, It is good idea in general? move id_symbol in table name - potentially it can be 1000 of unique id_symbol and a lot of data for each. Separate this data on individual tables look like good idea!? But I lose order by possibility, that is so necessary for me to take fresh data by each symbol_id.

Thanks.

Upvotes: 0

Views: 152

Answers (1)

Alex Ott
Alex Ott

Reputation: 87109

You can't sort on the partition key, you can sort only on clustering columns inside the single partition. So you need to model your data accordingly. But you need to be very careful not to create very large partitions (when using ticker_id as partition key, for example). In this case you may need to create a composite keys, like, ticker_id + year, or month, depending on how often you're inserting the data.

Regarding the table per ticker, that's not very good idea, because every table has overhead, it will lead to increased resource consumption. 200 tables is already high number, and 500 is almost "hard limit"

Upvotes: 2

Related Questions