Reputation: 865
I am fairly new to Cassandra and have been reading a lot over the last month.
I am working on a small use case.
Query: Top X Players based on AmountPlayed in a TimeRange.
So at any given time range i would want to aggreagate players TotalAmountPlayed and derive Top X Players.
I followed the approach of creating UDF (using C*-2.2.0 version) for aggregation of AmountPlayed by a Player.
Below is my Time Series data model i designed for this use case.
CREATE COLUMNFAMILY PlayerRating
(
PlayerNumber int, ===> Unique account number
GameID text, ===> unique machine ID per slot
AmountPlayed double, ===> AmountPlayed by the player
EventTime timestamp, ===> Event generated TimeStamp
PRIMARY KEY ((PlayerNumber, GameID),EventTime)) WITH CLUSTERING ORDER BY(EventTime desc);
Please let me know if my Data Model design is proper for My Query.
Thank You !!
Upvotes: 2
Views: 140
Reputation: 9475
I think it might be easier to put all the players for each game into a single partition.
That way you could aggregate all the players with one query instead of making a separate query for each player. Then you could aggregate the playing time for each player into a map (see an example of how to define the UDF's for that here).
So your table would look something like this:
CREATE TABLE playing_time_by_game (game_id text, event_time int, player_id text, amount_played int, PRIMARY KEY (game_id, event_time));
Then create the UDF to total by player_id:
CREATE FUNCTION state_group_and_total( state map<text, int>, type text, amount int )
CALLED ON NULL INPUT
RETURNS map<text, int>
LANGUAGE java AS '
Integer count = (Integer) state.get(type); if (count == null) count = amount; else count = count + amount; state.put(type, count); return state; ' ;
Then create the aggregate function:
CREATE OR REPLACE AGGREGATE group_and_total(text, int)
SFUNC state_group_and_total
STYPE map<text, int>
INITCOND {};
Then insert some data:
SELECT * from playing_time_by_game ;
game_id | event_time | amount_played | player_id
---------+------------+---------------+-----------
game1 | 0 | 8 | player1
game1 | 1 | 12 | player2
game1 | 5 | 1 | player2
game1 | 8 | 50 | player1
game2 | 0 | 200 | player1
Now you can aggregate by player_id:
SELECT group_and_total(player_id, amount_played) from playing_time_by_game;
t2.group_and_total(player_id, amount_played)
----------------------------------------------
{'player1': 258, 'player2': 13}
And you can restrict the query to the game partition and time range:
SELECT group_and_total(player_id, amount_played) from playing_time_by_game where game_id='game1' and event_time >=0 and event_time <=7;
t2.group_and_total(player_id, amount_played)
----------------------------------------------
{'player1': 8, 'player2': 13}
You could probably also define a FINALFUNC to sort and keep just the top ten items in the map. See this.
Upvotes: 1