Sushan
Sushan

Reputation: 87

How to find total sessions played in BigQuery?

How to find out the total number of sessions played by all users in a month time frame. The event user_engagement has a parameter session count which increments on each session. The issue being the user who plays 10 sessions would have session count 1 to 10. So how am I supposed to add only the max session count i.e 10 in this instance and similarly for all users.

SELECT
  SUM(session_count) AS total_sessions,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM
  `xyz.analytics_111.events_*`

WHERE
  event_name = "user_engagement" AND (_TABLE_SUFFIX BETWEEN "20200201" AND "20200229")
  AND platform = "ANDROID"

Upvotes: 0

Views: 1516

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269803

I am unclear on what your data looks like. If there is one row per session, then you can simply use:

SELECT COUNT(*) AS total_sessions,
       COUNT(DISTINCT user_pseudo_id) AS users
. . .

If you can have multiple events per session, you can use a hacky approach:

SELECT COUNT(DISTINCT CONCAT(user_pseudo_id, ':', CAST(session_count as string)))

I offer this, because sometimes in a complex query, it is simpler to just tweak a single row. Otherwise, Mikhail's solution is reasonable.

However, I would suggest window functions instead:

SELECT SUM(CASE WHEN seqnum = 1 THEN session_count END) AS total_sessions,
       COUNT(DISTINCT user_pseudo_id) AS users
FROM (SELECT e.*,
             ROW_NUMBER() OVER (PARTITION BY user_pseudo_id ORDER BY session_count DESC) as seqnum
      FROM `xyz.analytics_111.events_*`
      WHERE e.event_name = 'user_engagement' AND
            _TABLE_SUFFIX BETWEEN '20200201' AND '20200229' AND
            platform = 'ANDROID'
    ) e;

The reason I recommend this is because you can keep the rest of the calculation without changing them. That is handy in a complex query.

Upvotes: 1

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172993

Try below (BigQuery Standard SQL)

#standardSQL
SELECT
  SUM(session_count) AS total_sessions,
  COUNT(user_pseudo_id) AS users
FROM (
  SELECT user_pseudo_id, MAX(session_count) session_count
  FROM `xyz.analytics_111.events_*`
  WHERE event_name = "user_engagement" 
  AND _TABLE_SUFFIX BETWEEN "20200201" AND "20200229"
  AND platform = "ANDROID"
  GROUP BY user_pseudo_id
)

Upvotes: 1

Related Questions