Marc
Marc

Reputation: 3313

How to find missing data in table Sql

This is similar to How to find missing data rows using SQL? and How to find missing rows (dates) in a mysql table? but a bit more complex, so I'm hitting a wall.

I have a data table with the noted Primary key:

country_id (PK)
product_id (PK)
history_date (PK)
amount

I have a products table with all products, a countries table, and a calendar table with all valid dates.

I'd like to find all countries, dates and products for which there are missing products, with this wrinkle: I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don't need to find it) - so, by definition, there is an entry in the history table for every country and date I care about.

I know it's going to involve some joins maybe a cross join, but I'm hitting a real wall in finding missing data.

I tried this (pretty sure it wouldn't work):

SELECT h.history_date, h.product_id, h.country_id, h.amount
FROM products p
LEFT JOIN history h ON (p.product_id = h.product_id)
WHERE h.product_id IS NULL

No Joy.

I tried this too:

WITH allData AS (SELECT h1.country_id, p.product_id, h1.history_date
FROM products p 
CROSS JOIN (SELECT DISTINCT country_id, history_date FROM history) h1)
SELECT f.history_date, f.product_id, f.country_id
FROM allData f
LEFT OUTER JOIN history h ON (f.country_id = h.country_id AND f.history_date = h.history_date AND f.product_id = h.product_id)
WHERE h.product_id IS NULL 
   AND h.country_id IS NOT NULL 
   AND h.history_date IS NOT null

also no luck. The CTE does get me every product on every date that there is also data, but the rest returns nothing.

Upvotes: 0

Views: 896

Answers (1)

Andrew Sayer
Andrew Sayer

Reputation: 2336

I only care about dates for which there are entries for a country for at least one product (i.e. if the country has NOTHING on that day, I don't need to find it)

So we care about this combination:

from    (select distinct country_id, history_date from history) country_date
cross join products p

Then it's just a matter of checking for existence:

select *
from    (select distinct country_id, history_date from history) country_date
cross join products p
where not exists (select null
                  from history h
                  where country_date.country_id   = h.country_id
                   and  country_date.history_date = h.history_date
                   and  p.product_id              = h.product_id
                 ) 

Upvotes: 1

Related Questions