heyooo678
heyooo678

Reputation: 91

Duplicate records upon joining table

I am still very new to SQL and Tableau however I am trying to work myself towards achieving a personal project of mine.

Table A; shows a table which contains the defect quantity per product category and when it was raised

+--------+-------------+--------------+-----------------+
| Issue# | Date_Raised | Category_ID# | Defect_Quantity | 
+--------+-------------+--------------+-----------------+
| PCR12  | 11-Jan-2019 | Product#1    |              14 |
| PCR13  | 12-Jan-2019 | Product#1    |              54 |
| PCR14  | 5-Feb-2019  | Product#1    |               5 |
| PCR15  | 5-Feb-2019  | Product#2    |               7 | 
| PCR16  | 20-Mar-2019 | Product#1    |              76 | 
| PCR17  | 22-Mar-2019 | Product#2    |               5 | 
| PCR18  | 25-Mar-2019 | Product#1    |              89 | 
+--------+-------------+--------------+-----------------+

Table B; shows the consumption quantity of each product by month

+-------------+--------------+-------------------+
| Date_Raised | Category_ID# | Consumed_Quantity |
+-------------+--------------+-------------------+
| 5-Jan-2019  | Product#1    | 100               |
| 17-Jan-2019 | Product#1    | 200               |
| 5-Feb-2019  | Product#1    | 100               |
| 8-Feb-2019  | Product#2    | 50                |
| 10-Mar-2019 | Product#1    | 100               |
| 12-Mar-2019 | Product#2    | 50                |
+-------------+--------------+-------------------+

END RESULT

I would like to create a table/bar chart in tableau that shows that Defect_Quantity/Consumed_Quantity per month, per Category_ID#, so something like this below;

+----------+-----------+-----------+
|  Month   | Product#1 | Product#2 |
+----------+-----------+-----------+
| Jan-2019 | 23%       |           |
| Feb-2019 | 5%        | 14%       |
| Mar-2019 | 89%       | 10%       |
+----------+-----------+-----------+

WHAT I HAVE TRIED SO FAR Unfortunately i have not really done anything, i am struggling to understand how do i get rid of the duplicates upon joining the tables based on Category_ID#.

Appreciate all the help I can receive here.

Upvotes: 1

Views: 73

Answers (3)

Gordon Linoff
Gordon Linoff

Reputation: 1270713

I would do this as:

select to_char(date_raised, 'YYYY-MM'),
       (sum(case when product = 'Product#1' then defect_quantity end) /
        sum(case when product = 'Product#1' then consumed_quantity end)
       ) as product1,
       (sum(case when product = 'Product#2' then defect_quantity end) /
        sum(case when product = 'Product#2' then consumed_quantity end)
       ) as product2        
from ((select date_raised, product, defect_quantity, 0 as consumed_quantity
       from a
      ) union all
      (select date_raised, product, 0 as defect_quantity, consumed_quantity
       from b
      )
     ) ab
group by to_char(date_raised, 'YYYY-MM')
order by min(date_raised);

(I changed the date format because I much prefer YYYY-MM, but that is irrelevant to the logic.)

Why do I prefer this method? This will include all months where there is a row in either table. I don't have to worry that some months are inadvertently filtered out, because there are missing production or defects in one month.

Upvotes: 1

Ed Bangga
Ed Bangga

Reputation: 13016

I can think of doing left joins on both product1 and 2.

select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')
    , (p2.product1 - sum(case when category_id='Product#1' then Defect_Quantity else 0 end))/p2.product1 * 100 
    , (p2.product2 - sum(case when category_id='Product#2' then Defect_Quantity else 0 end))/p2.product2 * 100  
from tableA t1
left join 
    (select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy') Date_Raised
        , sum(Comsumed_Quantity) as product1 tableB  
        where category_id = 'Product#1'
        group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')) p1
    on p1.Date_Raised = t1.Date_Raised  
left join 
    (select to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy') Date_Raised
        , sum(Comsumed_Quantity) as product2 tableB  
        where category_id = 'Product#2'
        group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')) p2
    on p2.Date_Raised = t1.Date_Raised
group by to_char(to_date(Date_Raised,'d-mon-yyyy'),'mon-yyyy')

Upvotes: 2

kota
kota

Reputation: 51

By using ROW_NUMBER() OVER (PARTITION BY ORDER BY ) as RN, you can remove duplicate rows. As of your end result you should extract month from date and use pivot to achieve.

Upvotes: 1

Related Questions