Greenleaf
Greenleaf

Reputation: 3

Filtering sessions by hit-scoped custom dimensions in BigQuery

I am working with GA data in BigQuery (ga360 export). I have a GA Resource working on a lot of domains. Each has its own Custom Dimension (hit-level scope, index=9) with a value of project name: project1, project2, etc (several domains could be joined in 1 project). What I need is to count visits of users that were visiting only project1 and had no hits on other projects. My query:

SELECT
  date,
  SUM(totals.visits) as visits,
  project
FROM (
  SELECT
    date,
    totals.visits,
    MAX(IF(hits.customDimensions.index=9
        AND REGEXP_MATCH(hits.customDimensions.value, 'project1'), hits.customDimensions.value, NULL)) WITHIN RECORD AS project
  FROM (TABLE_DATE_RANGE([project:dataset.ga_sessions_], TIMESTAMP('2018-03-31'), TIMESTAMP('2018-03-31')))
    )
GROUP BY
  project

The problem is that this query gives me a figure of sessions that have at least 1 hit on project1 regardless of having hits on other projects. I tried excluding by WHERE and HAVING clauses, the result is the same. How could I filter it correctly?

Upvotes: 0

Views: 1044

Answers (1)

Martin Weitzmann
Martin Weitzmann

Reputation: 4746

This is just an untested sketch, but I think it shows the idea - create condition columns in sub-query and use them in the upper query to sum up sessions.

SELECT
  date,
  SUM(IF(isProject1 AND NOT isAnyOtherProject,totals.visits,0)) as visits,
  project
FROM (
  SELECT
    date,
    totals.visits,
    MAX(IF(hits.customDimensions.index=9
        AND REGEXP_MATCH(hits.customDimensions.value, 'project1'), TRUE, FALSE)) WITHIN RECORD AS isProject1,
    MAX(IF(...same as above with different regex identifying other projects...)) isAnyOtherProject
  FROM (TABLE_DATE_RANGE([project:dataset.ga_sessions_], TIMESTAMP('2018-03-31'), TIMESTAMP('2018-03-31')))
    )
GROUP BY
  project

You can move the second MAX-IF also to a OMIT RECORD clause if it makes sense.

Upvotes: 1

Related Questions