Reputation: 537
, v2p AS (
SELECT DISTINCT video_asin, product_asin
from video_table
where video_asin IN (
‘ABC’,
‘CDF’,
‘DEF’,
‘FRW’)
)
, video_data AS (
SELECT distinct
start_date::date,
video_view,
video_asin,
page_asin
FROM video_metrics
WHERE
page_asin IN (select DISTINCT product_asin from v2p)
AND start_date between TO_DATE('01/26/2018','MM-DD-YYYY') and TO_DATE('02/04/2018','MM-DD-YYYY') + 0.9999
)
select sum(video_view) from video_data;
I’m running simple query, but I’m getting strange results with IN and NOT IN .
(1). When I do page_asin IN (select DISTINCT product_asin from v2p),
I am getting 1,109,567 as select sum(video_view) from video_data;
(2). When I do the same query but with NOT IN page_asin NOT IN (select DISTINCT product_asin from v2p),
I am getting 7,032,405 as sum.
(3). If I remove this whole line (page_asin IN (select DISTINCT product_asin from v2p),
) to get all results, I am getting 8,148,803. as sum.
But I supposed that if I add (1) and (2) I should get (3). But in reality, I am getting: 1,109,567 + 7,032,405 = 8,141,972 AND NOT 8,148,803. from (3).
Why that happens? Why I am missing ~7000 video views?
Upvotes: 2
Views: 4889
Reputation: 1270713
A NULL
value would be neither IN
nor NOT IN
a list.
So, page_asin
must be NULL
on some rows.
As a note: The select distinct
is redundant in the subquery. There is no reason to include it.
Upvotes: 4