Reputation: 85
I just begin study cassandra.
It was a table and queries.
CREATE TABLE finance.tickdata(
id_symbol int,
ts timestamp,
bid double,
ask double,
PRIMARY KEY(id_symbol,ts)
);
And query is successful,
select ts,ask,bid
from finance.tickdata
where id_symbol=3
order by ts desc;
Next it was decision move id_symbol in table name, new table(s) scripts.
CREATE TABLE IF NOT EXISTS mts_src.ticks_3(
ts timestamp PRIMARY KEY,
bid double,
ask double
);
And now query fails,
select * from mts_src.ticks_3 order by ts desc
I read from docs, that I need use and filter (WHERE) by primary key (partition key), but technically my both examples same. Why cassandra so restricted in this aspect?
And one more question, It is good idea in general? move id_symbol in table name - potentially it can be 1000 of unique id_symbol and a lot of data for each. Separate this data on individual tables look like good idea!? But I lose order by possibility, that is so necessary for me to take fresh data by each symbol_id.
Thanks.
Upvotes: 0
Views: 152
Reputation: 87109
You can't sort on the partition key, you can sort only on clustering columns inside the single partition. So you need to model your data accordingly. But you need to be very careful not to create very large partitions (when using ticker_id
as partition key, for example). In this case you may need to create a composite keys, like, ticker_id
+ year, or month, depending on how often you're inserting the data.
Regarding the table per ticker, that's not very good idea, because every table has overhead, it will lead to increased resource consumption. 200 tables is already high number, and 500 is almost "hard limit"
Upvotes: 2