Reputation: 21
I have been having some difficulties with this issue for a few days and I cannot figure it out.
(Running this is Bigquery)
For a marketing client, I am trying to obtain impressions and clicks per advertiser.
p_impressions has all the impressions data, including IDs Campaign_ID, Advertiser_ID p_click has all the click data, including IDs Campaign_ID, Advertiser_ID match_table_campaigns has campaign data: Campaign_ID, Campaign Name
(essentially my simple goal is to get the count of impressions and clicks and match it to the Campaign table so that I can see the name of the campaign and not just the ID.
When I run a query using UNION all for both the Impression and Click tables, using only the IDS, all the numbers match my platform and everything looks great.
However, when I insert a JOIN (I've tried all of them) to include the name of the campaign, all the metrics of Impressions and click seems to increase and I cannot seem to get the correct numbers anymore.
with combinedDeliveryData AS (
SELECT
FORMAT_DATETIME("%F",DATETIME(TIMESTAMP_MILLIS( CAST(SUBSTR(CAST( Event_Time AS String),0,LENGTH(CAST( Event_Time AS String)) - 3) AS INT64 )),"America/Toronto")) AS Date,
Impression_ID,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID,
FROM
CampaignManager.p_impression
UNION ALL
SELECT
FORMAT_DATETIME("%F",DATETIME(TIMESTAMP_MILLIS( CAST(SUBSTR(CAST( Event_Time AS String),0,LENGTH(CAST( Event_Time AS String)) - 3) AS INT64 )),"America/Toronto")) AS Date,
Impression_ID,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID,
FROM
CampaignManager.p_click)
SELECT cpg.Campaign_ID, cpg.Campaign,data.Date,COUNT(case data.Event_Type when 'VIEW' then 1 else null end) AS Impressions
FROM match_table_campaigns AS cpg
LEFT JOIN combinedDeliveryData AS data ON cpg.Campaign_ID = data.Campaign_ID
WHERE cpg.Advertiser_ID = "4739279"
GROUP BY 3,1,2
Any help would be greatly appreciated. I am not sure what I am missing.
AS a note, here is what works without the JOIN
SELECT
Impression_ID,
Event_Time,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID
FROM
CampaignManager.p_impression
UNION ALL
SELECT
Impression_ID,
Event_Time,
DBM_Advertiser_ID,
Campaign_ID,
Ad_ID,
Rendering_ID,
Event_Type,
Advertiser_ID
FROM
CampaignManager.p_click)
SELECT FORMAT_DATETIME("%F",DATETIME(TIMESTAMP_MILLIS( CAST(SUBSTR(CAST( clicks.Event_Time AS String),0,LENGTH(CAST( clicks.Event_Time AS String)) - 3) AS INT64 )),"America/Toronto")) AS exposureDate,clicks.Campaign_ID,COUNT(case Event_Type when 'VIEW' then 1 else null end) AS Impressions,COUNT(case Event_Type when 'CLICK' then 1 else null end) AS Clicks
FROM
combinedDeliveryData AS clicks
WHERE clicks.Campaign_ID = "4739279"
GROUP BY 1,2
NEW ISSUE: Ive added the following code and things looked great
this_is_deduplicated AS (
SELECT x.*
FROM (
SELECT Campaign_ID, ARRAY_AGG(a LIMIT 1)[OFFSET(0)] x
FROM combinedDeliveryData a
GROUP BY 1
)
)
LEFT JOIN this_is_deduplicated AS data ON cpg.Campaign_ID = data.Campaign_ID
Getting all my values, and they are acurrate. I am now running into a nother issue where I have to join a final table, but the solution above is not working for it.
Values in new table: Date Campaign_ID Activity_ID
I need to now join this last table so that for each Date, I get the COUNT of Activity_ID As Conversiions.
My Final table should be:
Advertiser, Campaign, Impressions, Clicks, Conversions (As above)
Upvotes: 0
Views: 318
Reputation: 59225
I'm almost sure the cause is having multiple identical campaign_id
in the combinedDeliveryData
table.
Make sure there's only one id per campaign before doing the join. With something like:
WITH [...],
this_is_deduplicated AS (
SELECT x.*
FROM (
SELECT Campaign_ID, ARRAY_AGG(a LIMIT 1)[OFFSET(0)] x
FROM combinedDeliveryData a
GROUP BY 1
)
)
And JOIN
against that as in:
LEFT JOIN this_is_deduplicated AS data ON cpg.Campaign_ID = data.Campaign_ID
instead of
LEFT JOIN combinedDeliveryData AS data ON cpg.Campaign_ID = data.Campaign_ID
or a shorter
LEFT JOIN this_is_deduplicated AS data USING(Campaign_ID)
Upvotes: 1
Reputation: 7503
You may need to join on campaign
and campaign_id
. Provided you sample query below
select
t1.date,
t1.campaign_id,
t1.campaign,
sum(case when t1.event = 'view' then 1 else 0 end) as impressions,
sum(case when t2.event = 'click' then 1 else 0 end) as clicks
from table1 as t1
left join table2 as t2
on t1.campaign = t2.campaign
and on t1.campaign_id = t2.campaign_id
where advertiser_id = '12345'
group by
t1.campaign_id,
t1.campaign,
t1.date
Upvotes: 0