Reputation: 2888
Suppose I define a Cassandra table to store time series:
CREATE TABLE series (
    series_id INT,
    time TIMESTAMP,
    value DOUBLE,
    PRIMARY KEY (series_id, time)
) WITH CLUSTERING ORDER BY (time DESC);
In Java, I then use the driver to query for some of these series:
com.datastax.driver.core.ResultSet results =
    session.execute(
        "SELECT * FROM series WHERE series_id IN (1, 2)");
This returns a list of rows, each of which is a single data point from one of the two series. However, the series ID (1 or 2) is repeated on every row. Is it possible to make the payload more efficient by returning just two rows, one for series 1 and one for series 2, each with a variable number of columns, one column per data point?
Upvotes: 0
Views: 143
Reputation: 101
What you are trying to do is not possible in CQL3. Cassandra transposes the result view to shield the user from having to deal with the underlying storage format.
However, using the DataStax driver and its object-mapping API, you can get a similar result on the client side with the code below:
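import java.util.*;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.mapping.Mapper;
import com.datastax.driver.mapping.MappingManager;
import com.datastax.driver.mapping.Result;

List<Integer> ids = Arrays.asList(1, 2, 3, 4);
Map<Integer, ResultSetFuture> futures = new HashMap<>();
Map<Integer, List<DataPoint>> requiredMap = new HashMap<>();
MappingManager manager = new MappingManager(session);
/*
 * DataPoint has two attributes, "private Date time;" and "private double value;".
 * Assumption: it also carries the driver's mapping annotations (e.g. @Table)
 * so that the mapper accepts it.
 */
Mapper<DataPoint> mapper = manager.mapper(DataPoint.class);
// Fire one asynchronous single-partition query per series id.
for (Integer id : ids) {
    futures.put(id, session.executeAsync("SELECT time, value FROM series WHERE series_id = " + id));
}
// Wait for each result and map its rows to DataPoint objects.
for (Map.Entry<Integer, ResultSetFuture> entry : futures.entrySet()) {
    ResultSet result = entry.getValue().getUninterruptibly();
    Result<DataPoint> dataPoints = mapper.map(result);
    requiredMap.put(entry.getKey(), dataPoints.all());
}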
Also, please note that using "IN" on the partition key is generally considered an anti-pattern in Cassandra and should be avoided, because a single coordinator node then has to collect data from several partitions. That is why I loop over the ids with async executions to get the same effect.
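If you would rather keep a single query and only want the "one entry per series" shape in your application, the flat rows can also be regrouped after they arrive. Below is a minimal sketch of that idea using only the JDK. FlatRow, Point and groupBySeries are hypothetical names invented for this illustration, and FlatRow stands in for whatever the driver hands back. Note that this only reshapes the data on the client; it does not change what is sent over the wire.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SeriesGrouping {

    /** Hypothetical stand-in for one flat result row: (series_id, time, value). */
    static final class FlatRow {
        final int seriesId;
        final long timeMillis;
        final double value;

        FlatRow(int seriesId, long timeMillis, double value) {
            this.seriesId = seriesId;
            this.timeMillis = timeMillis;
            this.value = value;
        }
    }

    /** One data point with the repeated series id stripped off. */
    static final class Point {
        final long timeMillis;
        final double value;

        Point(long timeMillis, double value) {
            this.timeMillis = timeMillis;
            this.value = value;
        }
    }

    /**
     * Groups flat rows into one list of points per series id.
     * Rows keep their incoming order within each series, and series
     * appear in the order in which they are first seen.
     */
    static Map<Integer, List<Point>> groupBySeries(List<FlatRow> rows) {
        Map<Integer, List<Point>> grouped = new LinkedHashMap<>();
        for (FlatRow row : rows) {
            grouped.computeIfAbsent(row.seriesId, id -> new ArrayList<>())
                   .add(new Point(row.timeMillis, row.value));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<FlatRow> rows = new ArrayList<>();
        rows.add(new FlatRow(1, 2000L, 1.5));
        rows.add(new FlatRow(1, 1000L, 1.0));
        rows.add(new FlatRow(2, 2000L, 7.0));

        Map<Integer, List<Point>> grouped = groupBySeries(rows);
        System.out.println(grouped.keySet());      // prints [1, 2]
        System.out.println(grouped.get(1).size()); // prints 2
    }
}
```

With the real driver you would build each FlatRow (or add to the map directly) while iterating over the ResultSet, reading series_id, time and value from every row.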
Upvotes: 1