Axis
Axis

Reputation: 2132

Need to calculate dynamic cumulative sum divided by max cohort per filter in Tableau

One of my goals was to find average user play days. For example, I need to measure how many days a player plays (number of logins that day) in 60 days (retention day). When a user joins the game that day, I need to count those users and divide them into their cohort per retention day. The following query calculates this.

WITH player_details AS (
               SELECT
                   app,
                   channel,
                   uid,
                   install_dt
               FROM users
               WHERE engaged AND NOT cheater
     ),
     player_daily_stats AS (
                   SELECT
                   app,
                   channel,
                   uid,
                   dt,
                   FROM users_daily
               ),
     aggregated_data AS (
                SELECT
                p.channel,
                install_dt,
                s.dt AS dt,
                DATE_DIFF(s.dt, install_dt, DAY) AS retention_day,
                COUNTIF(install_dt = s.dt AND DATE_DIFF(s.dt, install_dt, DAY) = 0) AS installs,
                COUNTIF(s.dt = install_dt + DATE_DIFF(s.dt, install_dt, DAY)) AS retained_users,
                COUNT(DISTINCT s.uid) AS active_days
                
                FROM player_details p
                     LEFT JOIN player_daily_stats s ON p.app = s.app AND p.channel = s.channel AND p.uid = s.uid
                WHERE install_dt = '2023-09-01'
                GROUP BY
                p.channel,
                install_dt,
                s.dt
            )
SELECT
    channel,
    install_dt,
    dt,
    retention_day,
    installs,
    retained_users,
    active_days,
    SUM(active_days) OVER (PARTITION BY install_dt, channel ORDER BY retention_day) AS sum_active_days,
    FIRST_VALUE(retained_users) OVER (PARTITION BY install_dt, channel ORDER BY retention_day) AS first_cohort_size,
    SUM(active_days) OVER (PARTITION BY install_dt, channel ORDER BY retention_day) / FIRST_VALUE(retained_users) OVER (PARTITION BY install_dt, channel ORDER BY retention_day)  ratio
FROM aggregated_data

Thus the ratio calculates the cumulative retained users and divides by the number of cohorts:

platform install_dt dt retention_day installs retained_users active_days sum_active_days first_cohort_size ratio
ANDROID 2023-09-01 2023-09-01 0 5459 5459 5459 5459 5459 1
ANDROID 2023-09-01 2023-09-02 1 0 2108 2108 7567 5459 1.38615131
ANDROID 2023-09-01 2023-09-03 2 0 1404 1404 8971 5459 1.643341271
ANDROID 2023-09-01 2023-09-04 3 0 984 984 9955 5459 1.823594065
ANDROID 2023-09-01 2023-09-05 4 0 852 852 10807 5459 1.979666606
ANDROID 2023-09-01 2023-09-06 5 0 727 727 11534 5459 2.11284118
ANDROID 2023-09-01 2023-09-07 6 0 658 658 12192 5459 2.233376076
ANDROID 2023-09-01 2023-09-08 7 0 638 638 12830 5459 2.350247298
ANDROID 2023-09-01 2023-09-09 8 0 616 616 13446 5459 2.463088478
ANDROID 2023-09-01 2023-09-10 9 0 552 552 13998 5459 2.564205899
ANDROID 2023-09-01 2023-09-11 10 0 495 495 14493 5459 2.654881846
ANDROID 2023-09-01 2023-09-12 11 0 481 481 14974 5459 2.742993222
ANDROID 2023-09-01 2023-09-13 12 0 477 477 15451 5459 2.830371863
ANDROID 2023-09-01 2023-09-14 13 0 436 436 15887 5459 2.910239971
ANDROID 2023-09-01 2023-09-15 14 0 447 447 16334 5459 2.992123099
ANDROID 2023-09-01 2023-09-16 15 0 431 431 16765 5459 3.071075289
ANDROID 2023-09-01 2023-09-17 16 0 398 398 17163 5459 3.143982414
ANDROID 2023-09-01 2023-09-18 17 0 358 358 17521 5459 3.209562191
ANDROID 2023-09-01 2023-09-19 18 0 360 360 17881 5459 3.275508335
ANDROID 2023-09-01 2023-09-20 19 0 343 343 18224 5459 3.338340355

But now I have an extra goal to find user average play days between certain retention day. So an example I need to find user average play days between retention_day 3 to 7 and for this I need to calculate cumulative retained_users from retention_day 3 and divide cohort retention_day 3 (or max retention day after retention_day 3)

The result should be 3859/984 from the example data

Can I do in Tableau?

Upvotes: 0

Views: 102

Answers (1)

vizyourdata
vizyourdata

Reputation: 1444

The simple answer is yes. I put this together with a combination of window_sum, parameter start and end days and some calculated fields.

Parameter setup:

p_retention_day_start

p_retention_day_end

Filter to your selected retained day range

Create a calculated field named _show_retention_range

[Retention Day] >= [p_retention_day_start] AND [Retention Day] <= [p_retention_day_end]

Your Numerator (3859)##

Create a window sum function to calculate the retained users within your retention_day range. I named it _window_sum_retained_users

WINDOW_SUM(SUM([Retained Users]))

Your Denominator (984)

Create a FIXED{} calc and call it _fixed_cohort_start_retained_users

IF [Retention Day] = [p_retention_day_start] THEN { FIXED [Retention Day] : SUM([Retained Users]) } END

Calculated field to create the final value

[_window_sum_retained_users] / SUM([_fixed_cohort_start_retained_users])

create a worksheet and arrange the elements like this:

final worksheet

Hope this helps!

Upvotes: 2

Related Questions