rohit vernekar
rohit vernekar

Reputation: 77

MYSQL get product records from history table for two different date with only latest data if duplicate found

I have a transaction table which holds different types of products which hold two types of products which is stored in others field (json field) tagged as product_type1 and product_type2 and some products can be tagged as both.

Transaction table:

id, hash_value, product_name,price, date , others

1, abc, product_name1, 20, 2018-10-21, {"product_type1": 1}
2, def, product_name2, 30, 2018-10-22, {"product_type2": 1}
3, ghi, product_name3, 40, 2018-10-21, {"product_type1": 1, "product_type2": 1}

Unique key (id, hash_value)

For the above mentioned table, we have a history table which stores the historical products for the above products in transaction table (schema is same like transaction table , but unique key is on id,hash_value,date)

History table

id, hash_value, product_name, price, date , others
1, abc, product_name1, 20, 2018-10-21, {"product_type1": 1}
2, abc, product_name1, 18, 2018-10-20, {"product_type1": 1}
3, abc, product_name1, 19, 2018-10-19, {"product_type1": 1}
4, def, product_name2, 30, 2018-10-22, {"product_type2": 1}
4, def, product_name2, 29, 2018-10-21, {"product_type2": 1}
4, def, product_name2, 40, 2018-10-20, {"product_type2": 1}
5, ghi, product_name3, 40, 2018-10-21, {"product_type1": 1, "product_type2": 1}

5, ghi, product_name3, 50, 2018-10-22, {"product_type1": 1, "product_type2": 1}

I would like to query records with following condition. 1. Get all records tagged as product_type1 for date 2018-10-21 2. Get all records tagged as product_type2 for date 2018-10-22 3. If tagged as both then, get records having latest date.

The query i tried.

SELECT *
FROM   ((SELECT *
         FROM   history_table
         WHERE  date = '2018-10-21'
                AND others ->> '$.product_type1' == 1)
        UNION ALL
        (SELECT *
         FROM   history_table
         WHERE  date = '2018-10-22'
                AND others ->> '$.product_type2' == 1))  

but the result i get is

id, hash_value, product_name, price, date , others
1, abc, product_name1, 20, 2018-10-21, {"product_type1": 1}
4, def, product_name2, 30, 2018-10-22, {"product_type2": 1}
5, ghi, product_name3, 40, 2018-10-21, {"product_type1": 1, "product_type2": 1}
5, ghi, product_name3, 50, 2018-10-22, {"product_type1": 1, "product_type2": 1}

Result should only give record with latest date if duplicate record for id and hash_value exists

Results Expected:

id, hash_value, product_name, price, date , others
1, abc, product_name1, 20, 2018-10-21, {"product_type1": 1}
4, def, product_name2, 30, 2018-10-22, {"product_type2": 1}
5, ghi, product_name3, 50, 2018-10-22, {"product_type1": 1, "product_type2": 1}

Upvotes: 1

Views: 142

Answers (1)

Madhur Bhaiya
Madhur Bhaiya

Reputation: 28854

  • In a Derived table (sub-select query), identify the maximum date value for a combination of id and hash_value matching the prescribed conditions.
  • Now, simply join this result-set to the history_table to get the rows corresponding to maximum date value only.
  • Instead of using Union All to combine two different conditional Select results; you can directly use OR in the Where condition here. it will be significantly more efficient.

Try the following query:

SELECT ht.* FROM 
history_table AS ht 
JOIN 
(
  SELECT id, hash_value, MAX(date) AS max_date 
  FROM   history_table
  WHERE  (date = '2018-10-21' AND others ->> '$.product_type1' == 1) 
          OR 
         (date = '2018-10-22' AND others ->> '$.product_type2' == 1) 
  GROUP BY id, hash_value 
) AS dt ON dt.id = ht.id AND 
           dt.hash_value = ht.hash_value AND 
           dt.max_date = ht.date 

Upvotes: 1

Related Questions