Renato Rodrigues
Renato Rodrigues

Reputation: 1038

Last order timestamp per product

I want to find the last payment (or NULL if n/a) made for which specified product_id. Below is a representation of the tables I'm working with (simplified version).

+----------+
|Products  |
|----------+
|product_id|
+----------+
+---------------+
|Orders         |
+---------------+
|order_id       |
|order_timestamp|
|order_status   |
+---------------+
+-----------------+
|ProductsOrdersMap|
+-----------------+
|product_id       |
|order_id         |
+-----------------+

After JOINs, MAXs, GROUP BYs, LEFT JOINs, multiple INNER JOINs to get the greatest-n-per-group, I still can't get to the right result. Most of the times, products with multiple orders are returning multiple rows. The best results I got so far were (I was searching specific products):

product_id  order_id  order_timestamp      order_status
8           NULL      NULL                 NULL
9           NULL      NULL                 NULL
10          NULL      NULL                 NULL
12          NULL      NULL                 NULL
13          NULL      NULL                 NULL
14          11        2013-08-13 07:22:01  finished
15          11        2013-08-13 07:22:01  finished
15          12        2013-08-14 00:00:00  finished
32          11        2013-08-13 07:22:01  finished
83          9         2013-08-13 07:04:02  finished
83          10        2013-08-13 07:11:42  finished

Edit: After PP. anwser, I ended up with the following query:

SELECT p.product_id, o.order_id, MAX(order_timestamp) AS order_timestamp, order_status
FROM Products p LEFT JOIN (ProductsOrdersMap m, Orders o)
  ON (p.product_id = m.product_id AND m.order_id = o.order_id)
WHERE p.product_id IN (8,9,10,12,13,14,15,32,83)
GROUP BY p.product_id

Which returns

product_id  order_id  order_timestamp      order_status
8           NULL      NULL                 NULL
9           NULL      NULL                 NULL
10          NULL      NULL                 NULL
12          NULL      NULL                 NULL
13          NULL      NULL                 NULL
14          11        2013-08-13 07:22:01  finished
15          11        2013-08-13 07:22:01  finished
32          11        2013-08-13 07:22:01  finished
83          9         2013-08-13 07:04:02  finished

At first glance, it seems correct but only the products IDs and the timestamps are right. Comparing the two queries above, you can see that, for products 15 and 83, order_id is wrong (order_status might be wrong as well).

Upvotes: 1

Views: 97

Answers (3)

spencer7593
spencer7593

Reputation: 108480

This query should return the specified resultset (this is only desk checked, not tested)

to return ALL product_id

SELECT p.product_id
     , m.order_d
     , m.order_timestamp
     , m.order_status
  FROM products p
  LEFT
  JOIN ( SELECT kl.product_id
              , MAX(ko.order_timestamp) AS latest_timestamp
           FROM orderproductsmap kl
           JOIN orders ko
             ON ko.order_id = kl.order_id
          GROUP
             BY kl.product_id
       ) l
    ON l.product_id = p.product_id
  LEFT
  JOIN ( SELECT ml.product_id
              , mo.order_id
              , mo.order_timestamp
              , mo.order_status
           FROM orderproductsmap ml
           JOIN orders mo
             ON mo.order_id = ml.order_id
       ) m
    ON m.product_id = l.product_id
   AND m.order_timestamp = l.latest_timestamp
 GROUP
    BY p.product_id

The inline view "l" gets us the latest "order_timestamp" for each "product_id". This is joined to inline view "m" to get us the whole row for the order that has the latest timestamp.

If there happens to be more than one order with the same latest "order_timestamp" (i.e. order_timestamp is not guaranteed to be unique for a given product_id) then the outermost GROUP BY ensures that only one of those order rows is returned.

If only particular product_id values need to be returned, add a WHERE clause in the outermost query. For performance, that same predicate can be repeated in the inline views.

to return only SPECIFIC product_id we add three WHERE clauses:

SELECT p.product_id
     , m.order_d
     , m.order_timestamp
     , m.order_status
  FROM products p
  LEFT
  JOIN ( SELECT kl.product_id
              , MAX(ko.order_timestamp) AS latest_timestamp
           FROM orderproductsmap kl
           JOIN orders ko
             ON ko.order_id = kl.order_id
          WHERE kl.product_id IN (8,9,10,12,13,14,15,32,83)
          GROUP
             BY kl.product_id
       ) l
    ON l.product_id = p.product_id
  LEFT
  JOIN ( SELECT ml.product_id
              , mo.order_id
              , mo.order_timestamp
              , mo.order_status
           FROM orderproductsmap ml
           JOIN orders mo
             ON mo.order_id = ml.order_id
          WHERE ml.product_id IN (8,9,10,12,13,14,15,32,83)
       ) m
    ON m.product_id = l.product_id
   AND m.order_timestamp = l.latest_timestamp
 WHERE p.product_id IN (8,9,10,12,13,14,15,32,83)
 GROUP
    BY p.product_id

Only the WHERE clause on the outermost query is required. The other two are added just to improve performance by limiting the size of each of the derived tables.

Upvotes: 2

Ed Gibbs
Ed Gibbs

Reputation: 26363

To return all products, even those without orders, a LEFT JOIN is definitely the way to go. The answer from @PP above uses "old-style" inner joins and is equivalent to this:

SELECT
    P.product_id
    ,MAX(order_timestamp)
FROM Products P
INNER JOIN ProductsOrdersMap M ON P.product_id = M.product_id
INNER JOIN Orders O ON O.order_id = M.order_id
GROUP BY
    P.product_id

Starting with this syntax it's a lot easier to get to the LEFT JOIN - just replace INNER with LEFT:

SELECT
    P.product_id
    ,MAX(order_timestamp)
FROM Products P
LEFT JOIN ProductsOrdersMap M ON P.product_id = M.product_id
LEFT JOIN Orders O ON O.order_id = M.order_id
GROUP BY
    P.product_id

Addendum: Renato needed something more than just reworking the other answer as a LEFT JOIN because the order_id and order_status have to come along with the maximum timestamp. The easiest approach is to start with a list of product ID's and order ID's where the order has the maximum timestamp by order_id:

SELECT
  p2.product_id,
  o2.order_id
FROM Products p2
INNER JOIN ProductsOrdersMap m ON p2.product_id = m.product_id
INNER JOIN Orders o2 ON m.order_id = o2.order_id
WHERE (o2.order_id, o2.order_timestamp) IN (
  SELECT order_id, MAX(order_timestamp)
  FROM Orders
  GROUP BY order_id)

Then, instead of using ProductsOrdersMap to resolve products to orders, use the results from the query above:

SELECT
  p.product_id,
  o.order_id,
  o.TS,
  o.order_status
FROM Products p
LEFT JOIN (
  SELECT
    p2.product_id,
    o2.order_id
  FROM Products p2
  INNER JOIN ProductsOrdersMap m ON p2.product_id = m.product_id
  INNER JOIN Orders o2 ON m.order_id = o2.order_id
  WHERE (o2.order_id, o2.order_timestamp) IN (
    SELECT order_id, MAX(order_timestamp)
    FROM Orders
    GROUP BY order_id)
  ) MaxTS ON p.product_id = MaxTS.product_id
LEFT JOIN Orders o ON MaxTS.order_id = o.order_id

Upvotes: 1

PP.
PP.

Reputation: 10864


    SELECT
        P.product_id
        ,MAX(order_timestamp)
    FROM
        Products P
        ,Orders O
        ,ProductsOrdersMap M
    WHERE
        P.product_id = M.product_id
        AND O.order_id = M.order_id
    GROUP BY
        P.product_id

Upvotes: 1

Related Questions