Reputation: 97
I'm trying to filter data where I'm extracting from Google Analytics data sessions, clientid and a custom dimension. The custom dimension is only set in some cases and we, unfortunately do not set a default value. It can therefore be blank or null. The values all have a format which looks like this: 5lkfzls6e5xt8aazyygsop. They all start with a number. I thought my expression below would therefore only return custom dimensions which have something present, but I still get all other values returned.
SELECT date,
SUM(totals.visits) AS sessions,
fullVisitorId,
max(case when hits.customdimensions.index = 30 then hits.customdimensions.value end) dogs
FROM [data_source]
WHERE REGEXP_MATCH (hits.customdimensions.value, '^[0-9]')
GROUP BY date, fullVisitorId
ORDER BY dogs DESC
Edit, added sample data:
Upvotes: 0
Views: 79
Reputation: 172993
As you are using Legacy SQL - bellow is for Legacy SQL
1 - Below, which looks like exactly query in yo question does work!!!
#legacySQL
SELECT DATE,
SUM(totals.visits) AS sessions,
fullVisitorId,
MAX(CASE WHEN hits.customdimensions.index = 3 THEN hits.customdimensions.value END) dogs
FROM [project:dataset.table]
WHERE REGEXP_MATCH(hits.customdimensions.value, r'^[0-9]')
GROUP BY DATE, fullVisitorId
ORDER BY dogs DESC
2 - As alternative - try below version
#legacySQL
SELECT DATE,
SUM(totals.visits) AS sessions,
fullVisitorId,
MAX(CASE WHEN hits.customdimensions.index = 3 THEN hits.customdimensions.value END) dogs
FROM [project:dataset.table]
GROUP BY DATE, fullVisitorId
HAVING IFNULL(dogs, '') != ''
ORDER BY dogs DESC
3 - I just tested both and they both work - if they do not work for you - it means only that something with your data that you don't specify in you question
Upvotes: 1
Reputation: 521279
From the documentation, your syntax might be slightly off and you should be doing this:
WHERE REGEXP_MATCH (hits.customdimensions.value, r'^[0-9]')
^^^ you omitted the r
Upvotes: 0