void
void

Reputation: 41

MySQL for each distinct, get certain number of entries

I am somewhat stuck right now and could use some quick MySQL help.

Say I have a table containing reviews for products, columns product_id, customer, date. Now for each distinct product, I want to get the 3 first reviews. ORDER BY date is straightforward, but I'm stuck on how to get exactly 3 entries for each distinct product_id. Since it's all in the same table, INNER JOIN doesn't really make any sense, neither does GROUP BY. So, any ideas?

Edit: I've been successfully using this query on SQLFiddle:

SELECT * FROM (
  SELECT customer, DATE, p_asin, @curTop := 
   CASE WHEN (
     @prevP_asin <> p_asin
   )
   THEN 1 
   ELSE @curTop +1
   END AS Top, @prevP_asin := p_asin
  FROM reviews2 r, (
    SELECT @curTop :=0
  ) A
 ORDER BY r.p_asin, r.date ) B
WHERE top <=3

But for some reason, when I try to apply it to my tables in PhpMyAdmin, the numbering with Top doesn't start by 1 for new products, instead counts all together. How can Fiddle and the actual table behave so differently?

Upvotes: 3

Views: 1222

Answers (3)

T. Dayya
T. Dayya

Reputation: 788

I know it's been a while since the question was asked, but now that SQL 8.0 has been out for a while, figured i might add that we can use ROW_NUMBER() to partition by the key column (product_id in the OP's question.) + order by another column (date in the OP's question.)

In a nut shell, what we want to do is:

  1. Join the tables in question
  2. ORDER BY the key column (product_id) AND by any another columns you desire.
  3. Give row numbers to each record––resetting the count at each distinct value for the key column [each new product_id we encounter]
  4. use the row number from above step as a WHERE condition to select the 1st n rows (e.g. WHERE row_number <= 3 to get the 1st 3 rows for each distinct value from the key column)

Example:

SELECT FULL_TABLE.*
FROM (
         SELECT P.*,
                PD.*,
                ROW_NUMBER() over (
                    PARTITION BY P.id
                    ORDER BY P.date
                    ) AS row_num
         FROM Products AS P
                  LEFT JOIN ProductDetails AS PD on P.id = PD.productId
         ORDER BY P.id, row_num
     ) AS FULL_TABLE
WHERE row_num <= 3;

Upvotes: 1

tylert
tylert

Reputation: 301

The following will only work if we can assume date values are unique for a given product. If that is not true, then you will need to add a unique value into the where statement after "p1.date < p.date" so each row gets a unique rank.

SQLFiddle here

SELECT * FROM (
    SELECT 
      product_id,
      customer, (
      select count(*) as cnt from products p1 
          where p1.product_id = p.product_id and
                p1.date < p.date
        ) rnk
    FROM  
      products P
    ORDER BY
      P.product_id, P.date
    ) A
  WHERE rnk < 3

Upvotes: 2

tylert
tylert

Reputation: 301

In Oracle, you could use rank() over but that isn't available in MySQL. I used this page for an idea on how to do it: rank over in mysql

Also, here it is in SQLFiddle

SELECT * FROM (
SELECT 
  product_id,
  customer,
  @curRank := case when (@prevProductID <> product_id)
               then  1 else   @curRank + 1 
             end as Rnk,
  @prevProductID := product_id
FROM
   products p, (
     SELECT @curRank := 0
   ) r
ORDER BY
  P.product_id, P.date
) A
  WHERE Rnk <= 3

Upvotes: 1

Related Questions