zipp
zipp

Reputation: 1126

MySQL conditional self join

Table

CREATE TABLE dialoglog (
   userid int,
   appid int,
   type varchar(100) ,
   ds timestamp
 );

Data

userid  appid   type    ds
1        1      imp     January, 19 2016 03:14:07
1        1      click   January, 20 2016 03:14:07
2        1      imp     January, 19 2016 03:14:07
2        1      click   January, 18 2016 03:14:07
3        6       imp    January, 19 2016 03:14:07
10       1      click   January, 19 2016 03:14:07
1        1       imp    January, 20 2016 03:12:07

Question

What is the CTR (click/imp) per appid ?

Problem

We want to print the CTR per group

Problem is that we have "bad" data:

Therefore, a query like this:

SELECT SUM(CASE WHEN type='click' THEN 1 else 0 END)*1.0/SUM(CASE WHEN dialoglog.type='imp' THEN 1 else 0 END) as ctr,appid 
     FROM  dialoglog GROUP BY  appid;

Would not work because of the constrain I just said. Is it possible to code all these constrain in a single mysql query?

Here is sqlfiddle to play with

Result expected

For appid 1 we have 3 impression and 1 click (that is valid since for user 2 timestamp clicked < timestamp impression and for user 10 there was no impression) therefore we should have a CTR of 1/3.

For appid 6, we only have one impression => CTR = 0

Therefore I am expecting this result:

appid  CTR
1      0.333
6      0

What I tried so far

I thought about doing left join but the problem is that it will duplicate the right table:

SELECT * FROM dialoglog t1 LEFT JOIN dialoglog t2 on t1.userid=t2.userid and t1.appid=t2.appid and t1.type<>t2.type WHERE t1.type="imp" and t1.ds<t2.ds;



userid  appid   type    ds                   userid appid   type    ds
1        1      imp January, 19 2016 03:14:07   1    1    click January, 20 2016 03:14:07
1        1      imp January, 19 2016 03:12:07   1    1    click January, 20 2016 03:14:07

Upvotes: 2

Views: 221

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271231

You seem to need the earliest "imp" for each user. Then you can aggregate by the transactions after that point in time:

SELECT dl.appid, SUM(dl.type = 'click') / SUM(dl.type = 'imp')
FROM dialoglog dl LEFT JOIN
     (SELECT dl2.userId, MIN(ds) as imp_ds
      FROM dialoglog dl2
      WHERE type = 'imp'
      GROUP BY dl2.userid
     ) dlimp
     ON dl.userid = dlimp.userid 
WHERE dl.ds >= dlimp.imp_ds
GROUP BY dl.appid;

Upvotes: 1

Related Questions