Reputation: 11
I'm having a really hard time with this one.
I have a query that results in 40,194,047 rows.
When I try to left join the table from that original query with another, I do not pull back the same number of rows. I'd really like to have all the nulls from the left join but don't trust that my query is correct since the # of results are not the same. I'm only getting 40,120,219 rows.
Here's an example of my original query:
select *
from `exampletablea`
WHERE 1=1
and RETL_AMT = 0
and not RSN_CD in (23,41,43,24,25,27) --excluded reason codes
and BUS_DT BETWEEN '2019-02-04' AND '2020-02-02'
And here's an example of my join query:
select distinct
a.FSCL_YR_WK_KEY_VAL
,a.STR_NBR AS STORE
,A.MKUP_MKDN_CTRL_NBR
,A.MKDN_RSN_TXT
,A.EFF_BGN_DT AS MUMD_EFF_BGN_DATE
,A.RGSTR_NBR
,A.POS_TRANS_ID
,A.MKUP_MKDN_RSN_CD
,A.MKUP_MKDN_TYP_CD
,A.SKU_NBR AS SKU
,A.NET_MUMD_AMT
,A.MKUP_MKDN_RETL_AMT
, b.STR_NBR
, b.SKU_NBR
, b.SKU_CRT_DT
, A.BUS_DT
, b.EFF_BGN_DT
, b.EFF_END_DT
, b.SKU_STAT_CD
from `exampletablea`
left join `exambletableb`
on a.sku_nbr = b.sku_nbr
and a.str_nbr = b.str_nbr
and a.bus_dt between b.EFF_BGN_DT and b.EFF_END_DT
where 1=1
and RETL_AMT = 0
and not RSN_CD in (23,41,43,24,25,27) --excluded reason codes
and BUS_DT BETWEEN '2019-02-04' AND '2020-02-02'
I can't for the life of me figure out what I am doing wrong. I'm definitely new at SQL but I've tried all different join combos, moving all the where clauses to the join, etc.
I've tried using solutions from similar questions posted here but I haven't been success there either.
Any help would be greatly appreciated it!
Upvotes: 0
Views: 271
Reputation: 207952
The first query selects ALL
, then you have a query with DISTINCT
and your JOIN has that, hence resulting a smaller dataset.
Upvotes: 2