Teja
Teja

Reputation: 13544

Get new users compared to yesterday

I am having a user traffic table where I need to get the gain/loss of new users compared to the previous day. Just wondering if there is a better way to do this rather than the below solution.

Schema :-

Table Strcutre: Session_ID, session_day, user_id, product_id

What I have tried?

SELECT session_day,
       session_count,
       user_count - LAG( user_count, 1 ) OVER ( ORDER BY session_day ) AS gain_loss_users
  FROM   
    (
        SELECT session_day,
               COUNT( session_id ) AS session_count,
               COUNT( user_id ) user_count
          FROM user_traffic
         GROUP BY session_day
     ) X ; 

Upvotes: 1

Views: 76

Answers (2)

Ponder Stibbons
Ponder Stibbons

Reputation: 14858

I tried to resolve "new" and "returning" persons problem. Here is my attempt:

    select session_day, 
       COUNT( distinct user_id ) AS user_cnt,
       count(distinct user_id) - lag(count(distinct user_id)) 
                                     over (order by session_day) gain,
       count(newu) AS  newu, count(returnu) AS returnu
  from (
          select session_id,
                 session_day,
                 user_id, 
                 CASE WHEN
                 count(*) over ( partition by user_id ORDER BY session_day,session_id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW )
                           = 1 
                      THEN 1
                  END 
                  AS newu,
                 CASE WHEN 
                 lag( session_day,1 ) over ( partition by user_id ORDER BY session_day,session_id ) 
                           <> 
                           lag( session_day,1 ) over ( order by session_day,session_id ) 
                      THEN 1
                 END  AS returnu    
            from user_traffic u
        )
  group by session_day
  order by session_day;

Test data and output:

create table user_traffic (session_id number(6), session_day date, 
                           user_id number(6), product_id number(6));

insert into user_traffic values (  1, date '2016-09-07', 101, 1);
insert into user_traffic values (  2, date '2016-09-07', 101, 4);
insert into user_traffic values (  3, date '2016-09-07', 102, 1);
insert into user_traffic values (  4, date '2016-09-08', 101, 2);
insert into user_traffic values (  5, date '2016-09-08', 101, 4);
insert into user_traffic values (  6, date '2016-09-09', 102, 1);
insert into user_traffic values (  7, date '2016-09-10', 102, 1);
insert into user_traffic values (  8, date '2016-09-10', 103, 3);

SESSION_DAY        CNT       GAIN        NEW    RETURNS
----------- ---------- ---------- ---------- ----------
2016-09-07           2                     2          0   -- 101 & 102 are new
2016-09-08           1         -1          0          0
2016-09-09           1          0          0          1   -- 102 returned
2016-09-10           2          1          1          0   -- 103 is new

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270773

There is not a better way, but there is a more concise way. You can mix window functions with aggregation functions:

   SELECT session_day,
          COUNT(session_id ) as session_count,
          COUNT(DISTINCT user_id ) as user_count,
          (COUNT(DISTINCT user_id ) - 
           LAG(COUNT(DISTINCT user_id )) OVER (ORDER BY session_day)
          ) as gain_loss_users
      FROM user_traffic
     GROUP BY session_day;

I assume you want COUNT(DISTINCT) because (1) a user could have multiple sessions on the same day and (2) the two counts would be the same (if user_id and session_id are never NULL).

Upvotes: 0

Related Questions