Tam
Tam

Reputation: 11

Left Join not returning the same number of rows against original GBQ query

I'm having a really hard time with this one.

I have a query that results in 40,194,047 rows.

When I try to left join the table from that original query with another, I do not pull back the same number of rows. I'd really like to have all the nulls from the left join but don't trust that my query is correct since the # of results are not the same. I'm only getting 40,120,219 rows.

Here's an example of my original query:

select *
from `exampletablea`
WHERE 1=1 
  and RETL_AMT = 0
  and not RSN_CD in (23,41,43,24,25,27) --excluded reason codes 
  and BUS_DT BETWEEN '2019-02-04' AND '2020-02-02'

And here's an example of my join query:

select distinct 

a.FSCL_YR_WK_KEY_VAL    
,a.STR_NBR AS STORE    
,A.MKUP_MKDN_CTRL_NBR    
,A.MKDN_RSN_TXT    
,A.EFF_BGN_DT AS MUMD_EFF_BGN_DATE    
,A.RGSTR_NBR    
,A.POS_TRANS_ID    
,A.MKUP_MKDN_RSN_CD    
,A.MKUP_MKDN_TYP_CD    
,A.SKU_NBR AS SKU       
,A.NET_MUMD_AMT    
,A.MKUP_MKDN_RETL_AMT    
, b.STR_NBR
, b.SKU_NBR    
, b.SKU_CRT_DT    
, A.BUS_DT
, b.EFF_BGN_DT
, b.EFF_END_DT       
, b.SKU_STAT_CD

from `exampletablea`

left join `exambletableb`
  on a.sku_nbr = b.sku_nbr
  and a.str_nbr = b.str_nbr
  and a.bus_dt between b.EFF_BGN_DT and b.EFF_END_DT

where 1=1
  and RETL_AMT = 0
  and not RSN_CD in (23,41,43,24,25,27) --excluded reason codes 
  and BUS_DT BETWEEN '2019-02-04' AND '2020-02-02'

I can't for the life of me figure out what I am doing wrong. I'm definitely new at SQL but I've tried all different join combos, moving all the where clauses to the join, etc.

I've tried using solutions from similar questions posted here but I haven't been success there either.

Any help would be greatly appreciated it!

Upvotes: 0

Views: 271

Answers (1)

Pentium10
Pentium10

Reputation: 207952

The first query selects ALL, then you have a query with DISTINCT and your JOIN has that, hence resulting a smaller dataset.

Upvotes: 2

Related Questions