Reputation: 7475
I'm trying to figure out how to write my SQL query to get users day to day and retention. consider having the following row table round_statistics on each play round i have date of the round, now i would like to: 1. know how many users play two days in a row meaning played on Sunday and Monday, Monday and Tuesday, but Sunday and Tuesday doesn't count as two days in a row. 2. users retention 1-7
retention 7 is : % of users that have the chance to play the last 7 days (meaning they are registered at least 7 days) and had some activity (record) after 7 days.
retention 6-1 are the same only for 6-1 days.
Please help me to find out my game retention :) you will get a free coins to play it.... Thanks.
The table structure is: user_id,round_time
for example if i played 3 times today:
user id | round_time
1000, | '2013-08-10 14:02:53'
1000, | '2013-08-10 14:03:25'
1000, | '2013-08-10 14:04:47'
the result structure is:
date | 2013-08-10 | 2013-07-10
day to day | 10 | 100
retention 7 | 15 | 125
retention 6 | 20 | 210
retention 5 | 30 | 320
retention 4 | 40 | 430
retention 3 | 50 | 540
retention 2 | 60 | 650
retention 1 | 120 | 1620
Upvotes: 1
Views: 4104
Reputation: 51655
My sql don't has analytic functions, neither CTE and pivot table features, for this reasons it is not direct to do your required query (and nobody answer your question).
For this data:
create table t ( uid int, rt date);
insert into t values
(99, '2013-08-7 14:02:53' ), <- gap
(99, '2013-08-9 14:02:53' ), <-
(99, '2013-08-10 14:03:25' ),
(1000, '2013-08-7 14:02:53' ),
(1000, '2013-08-8 14:03:25' ),
(1000, '2013-08-9 14:03:25' ),
(1000, '2013-08-10 14:04:47');
This is an approach before pivot retentions, for a given date ( '2013-08-10 00:00:00' , '%Y-%m-%d')
:
select count( distinct uid ) as n, d, dt from
(
select uid,
'2013-08-10 00:00:00' as d,
G.dt
from
t
inner join
( select 7 as dt union all
select 6 union all select 5 union all
select 4 union all select 3 union all
select 2 union all select 1 union all select 0) G
on DATE_FORMAT( t.rt, '%Y-%m-%d') between
DATE_FORMAT( date_add( '2013-08-10 00:00:00', Interval -1 * G.dt DAY) ,
'%Y-%m-%d')
and
DATE_FORMAT( '2013-08-10 00:00:00' , '%Y-%m-%d')
where DATE_FORMAT(rt , '%Y-%m-%d') <= DATE_FORMAT( '2013-08-10 00:00:00' ,
'%Y-%m-%d')
group by uid, G.dt
having count( distinct DATE_FORMAT( T.rt, '%Y-%m-%d') ) = G.dt + 1
) TT
group by dt
Your pre-cooked data ( DT = 0 means today visits, DT = 1 means 2 consecutive days, ...):
| N | D | DT |
--------------------------------
| 2 | 2013-08-10 00:00:00 | 0 |
| 2 | 2013-08-10 00:00:00 | 1 |
| 1 | 2013-08-10 00:00:00 | 2 |
| 1 | 2013-08-10 00:00:00 | 3 |
Here it is ( for same data ):
select count( distinct uid ) as n, d, dt from
(
select uid,
z.zt as d,
G.dt
from
t
cross join
( select distinct DATE_FORMAT( t.rt, '%Y-%m-%d') as zt from t) z
inner join
( select 7 as dt union all
select 6 union all select 5 union all
select 4 union all select 3 union all
select 2 union all select 1 union all select 0) G
on DATE_FORMAT( t.rt, '%Y-%m-%d') between
DATE_FORMAT( date_add( z.zt, Interval -1 * G.dt DAY) ,
'%Y-%m-%d')
and
z.zt
where z.zt <= z.zt
group by uid, G.dt, z.zt
having count( distinct DATE_FORMAT( T.rt, '%Y-%m-%d') ) = G.dt + 1
) TT
group by d,dt
order by d,dt
Results at sqlfiddle: http://sqlfiddle.com/#!2/c26ec/10/0
| N | D | DT | GROUP_CONCAT( UID) |
--------------------------------------------
| 2 | 2013-08-07 | 0 | 1000,99 |
| 1 | 2013-08-08 | 0 | 1000 |
| 1 | 2013-08-08 | 1 | 1000 |
| 2 | 2013-08-09 | 0 | 1000,99 |
| 1 | 2013-08-09 | 1 | 1000 |
| 1 | 2013-08-09 | 2 | 1000 |
| 2 | 2013-08-10 | 0 | 1000,99 |
| 2 | 2013-08-10 | 1 | 99,1000 |
| 1 | 2013-08-10 | 2 | 1000 |
| 1 | 2013-08-10 | 3 | 1000 |
Upvotes: 2