Alkalyne
Alkalyne

Reputation: 153

How can I do this in SQL?

Today, I need your help.

I have a stats website, I get data from Game Webservices. I want to implement a new function but I don't know how.

I want to guess players' connection hours.

I have a script which collects data every hour and stores this data in a table. Imagine that I have a table with: player_id, score and the hour (Integer, just H), and the day number of the month.

Then, for example, if the score between hour 17 and 18 is different then player has been connected to his account.

To simplify, imagine that I have a table with day from 1 to 31 and hour from 0 to 23 for every day.

At the end of the month I need to execute a query to calculate for each hour, the number of days the player has been connected during this hour.

Example :

0 => 31     The player has been connected between 23 and 0  : every days
1 => 3      The player has been connected between 0 and 1  : 3 days a month
2 => 5      The player has been connected between 1 and 2  : 5 days a month
3 => 10     The player has been connected between 3 and 4  : 10 days a month
...
23 => 4

I think I can ORDER BY days and hour and player_id from day 1 hour 0 to day 31 hour 23 And do a first SELECT with a CASE like :

SELECT 
    table.*,  
   (CASE WHEN ACTUAL_ROW.score!=PREVIOUS_ROW.score THEN 1 ELSE 0) AS active
FROM table

TO know for each row if the player has been connected. AND THEN It's Simple to do a GROUP BY and a SUM for each hour. But I don't know how I can compare previous row with actual

Do you have any IDEA or hint how to do this ? Is PL/SQL Better to do this ?

Note :I'm using PostGreSQL

Thanks

Upvotes: 2

Views: 158

Answers (2)

LSerni
LSerni

Reputation: 57418

The problem here is that we're not checking when the player "was connected" but instead when the player "earned points", which can be similar - or not; and this at intervals of one hour (three logins in one hour count as one). Just as well, a player remaining logged three hours and accruing points in that period will result as being "logged" in one, two or three data points, depending.

With those caveats, we can JOIN the score table with itself:

SELECT a.player_id, a.day, a.hour, a.score - b.score AS chg
   FROM cdata AS a
   JOIN cdata AS b
   ON (
        (a.player_id = b.player_id AND a.score != b.score)
   AND (
        (a.hour > 0 AND a.day = b.day   AND b.hour = a.hour-1)
        OR
        (a.hour = 0 AND a.day = b.day+1 AND b.hour = 23)
       )
   )

This will yield a series of statistics for the user, with the day and hour when his score changed.

You can use this in a collecting subSELECT

SELECT player_id, hour, COUNT(player_id) FROM ( ... ) AS changes
    GROUP BY player_id, hour
    ORDER BY player_id, hour;

and this will return in 'changes' a number between 1 and 31. Hours with no logins will not be counted.

I have attempted to provide a test case with this SQLFiddle. The above is not PostgreSQL specific, you can optimize the inner query using PostgreSQL window functions.

Upvotes: 0

Ihor Romanchenko
Ihor Romanchenko

Reputation: 28591

You can access the previous row of the table with LAG window function.

Try using something like

SELECT player_id, count(CASE WHEN score > prev_score THEN 1 END)
FROM(
SELECT player_id, score, mm, hh, LAG(score) OVER (ORDER BY mm,hh) as prev_score
FROM your_table)
GROUP BY player_id

Additional advise - store full timestamps instead of day and hour fields. You can always get the day and hour from timestamp with functions.

Manual on window functions: one, two

Upvotes: 3

Related Questions