stkvtflw
stkvtflw

Reputation: 13547

BigQuery: Why am I getting more rows after adding "LEFT JOIN"?

WITH rawData AS (
  SELECT
    wholesaler_id,
    month,
    day,
    year,
    product_volume,
    product_gp,
    buyer_id,
    salesrep_id AS default_salesrep_id,
    salesrep_name,
    fpl
  FROM tables.transactions
  WHERE wholesaler_id = 'hos'
  AND year IN (2019, 2018)
  AND month = 12
),
salesRepsToBuyersAssociation AS (
  SELECT DISTINCT
    default_salesrep_id AS salesrep_id,
    buyer_id
  FROM rawData
)

This query returns 60636 rows:

SELECT *
FROM rawData

This query returns 72039 rows:

SELECT
  rawData.*,
  salesrep_id
FROM rawData
LEFT JOIN salesRepsToBuyersAssociation USING (buyer_id)

LEFT JOIN suppose to join only matching salesRepsToBuyersAssociation rows to rawData, right? Why on earth am I getting more rows than exist in the rawData?

Upvotes: 1

Views: 717

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270021

Because there are multiple rows in salesRepsToBuyersAssociation for one or more buyer_ids.

You can check this:

select buyer_id, count(*)
from salesRepsToBuyersAssociation
group by buyer_id
having count(*) > 1;

Upvotes: 6

Related Questions