Reputation: 1013
I want to join two tables, but only get 1 record of table2 per record on table1
For example:
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
This would get me all records in products
, which is not what I want. I want 1 [the first] product per category (I have a sort
column in the products field).
How do I go about doing that?
Upvotes: 101
Views: 211532
Reputation: 10404
I like more another approach described in a similar question: https://stackoverflow.com/a/11885521/2215679
This approach is better especially in case if you need to show more than one field in SELECT. To avoid Error Code: 1241. Operand should contain 1 column(s)
or double sub-select for each column.
For your situation the Query should looks like (this query also will work in PostgresQL and it is pretty fast, see my update below):
This is the fastest query. Use this one.
SELECT
c.id,
c.title,
p.id AS product_id,
p.title AS product_title
FROM categories AS c
JOIN products AS p ON
p.id = ( --- the PRIMARY KEY
SELECT p1.id FROM products AS p1
WHERE c.id=p1.category_id
ORDER BY p1.id LIMIT 1
)
PS. I did the performance test of the query vs other proposed here, and this query is the best option yet!
UPDATE (2022-07-20, PostgresSQL)
I'm not working with mySQL for a while already, so, I decided to test the performance of my solution (which actually works perfect in both MySQL and PostgresQL) with solution provided by @Gravy in PostgresQL v.12.9.
For that I decided to create a dummy tables and data with 100 categories and 100000 products. You can check the code on this gist
I run my query above and it took only 13ms to run.
After I slightly modified (for postgres) the query from @Gravy:
This is slow, do not use it!
SELECT
id,
category_title,
(array_agg(product_title))[1]
FROM
(SELECT c.id, c.title AS category_title, p.id AS product_id, p.title AS product_title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
ORDER BY c.id ASC) AS a
GROUP BY id, category_title;
and run it too. It took more than 150ms in my machine. Which is >10x times slower.
In defense of @gravy's solution, I agree with n+1 problem. But, in this particular case, usually the number of products is way larger than categories. So, running through each category is way less expensive than running through each product as in @Gravy's query.
By the way, if your table has 1mln products with 100 categories, the speed of my query is still the same (between 9-17ms), but the query from [@Gravy] takes more than 2 seconds to run
In resume, at this moment, my query is the most performant and optimal solution for the current task.
Feel free to comment.
Upvotes: 130
Reputation: 21
SELECT
c.id,
c.title,
p.id AS product_id,
p.title AS product_title
FROM categories AS c
JOIN products AS p ON
p.id = (
SELECT MIN(p1.id) FROM products AS p1
WHERE c.id=p1.category_id
)
Using MIN or MAX in a subquery will make your query run much faster.
Upvotes: 2
Reputation: 11
In my opinion, this is the best answer (making it general):
SELECT
TB1.Id
FROM Table1 AS TB1
INNER JOIN Table2 AS TB2 ON (TB1.Id = TB2.Id_TB1)
AND TB2.Id = (
SELECT Id
FROM Table2
WHERE TB1.Id = Id_TB1
ORDER BY Table2.Id DESC
LIMIT 1
)
Upvotes: 0
Reputation: 10125
Another example with 3 nested tables: 1/ User 2/ UserRoleCompanie 3/ Companie
SELECT
u.id as userId,
u.firstName,
u.lastName,
u.email,
urc.id ,
urc.companieRole,
c.id as companieId,
c.name as companieName
FROM User as u
JOIN UserRoleCompanie as urc ON u.id = urc.userId
AND urc.id = (
SELECT urc2.id
FROM UserRoleCompanie urc2
JOIN Companie ON urc2.companieId = Companie.id
AND urc2.userId = u.id
AND Companie.isPersonal = false
order by Companie.createdAt DESC
limit 1
)
LEFT JOIN Companie as c ON urc.companieId = c.id
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| userId | firstName | lastName | email | id | companieRole | companieId | companieName |
+---------------------------+-----------+--------------------+---------------------------+---------------------------+--------------+---------------------------+-------------------+
| cjjt9s9iw037f0748raxmnnde | henry | pierrot | [email protected] | cjtuflye81dwt0748e4hnkiv0 | OWNER | cjtuflye71dws0748r7vtuqmg | leclerc |
Upvotes: 1
Reputation: 4366
When using postgres you can use the DISTINCT ON
syntex to limit the number of columns returned from either table.
Here is a sample of the code:
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN (
SELECT DISTINCT ON(p1.id) id, p1.title, p1.category_id
FROM products p1
) p ON (c.id = p.category_id)
The trick is not to join directly on the table with multiple occurrences of the id, rather, first create a table with only a single occurrence for each id
Upvotes: -1
Reputation: 41
The With clause would do the trick. Something like this:
WITH SELECTION AS (SELECT id FROM products LIMIT 1)
SELECT a.id, c.id, c.title FROM selection a JOIN categories c ON (c.id = a.id);
Upvotes: 4
Reputation: 12445
Accepted answer by @goggin13 looks wrong. Other solutions provided to-date will work, but suffer from the n+1 problem and as such, suffer a performance hit.
n+1 problem: If there are 100 categories, then we would have to do 1 select to get the categories, then for each of the 100 categories returned, we would need to do a select to get the products in that category. So 101 SELECT queries would be performed.
My alternative solution solves the n+1 problem and consequently should be significantly more performant as only 2 selects are being performed.
SELECT
*
FROM
(SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
ORDER BY c.id ASC) AS a
GROUP BY id;
Upvotes: 34
Reputation: 1
Replace the tables with yours:
SELECT * FROM works w
LEFT JOIN
(SELECT photoPath, photoUrl, videoUrl FROM workmedias LIMIT 1) AS wm ON wm.idWork = w.idWork
Upvotes: -3
Reputation: 2277
SELECT c.id, c.title, p.id AS product_id, p.title
FROM categories AS c
JOIN products AS p ON c.id = p.category_id
GROUP BY c.id
This will return the first data in products (equals limit 1)
Upvotes: 17
Reputation: 7976
I would try something like this:
SELECT C.*,
(SELECT P.id, P.title
FROM products as P
WHERE P.category_id = C.id
LIMIT 1)
FROM categories C
Upvotes: -43
Reputation: 2148
What about this?
SELECT c.id, c.title, (SELECT id from products AS p
WHERE c.id = p.category_id
ORDER BY ...
LIMIT 1)
FROM categories AS c;
Upvotes: 7
Reputation: 62359
Assuming you want product with MIN()
imial value in sort
column, it would look something like this.
SELECT
c.id, c.title, p.id AS product_id, p.title
FROM
categories AS c
INNER JOIN (
SELECT
p.id, p.category_id, p.title
FROM
products AS p
CROSS JOIN (
SELECT p.category_id, MIN(sort) AS sort
FROM products
GROUP BY category_id
) AS sq USING (category_id)
) AS p ON c.id = p.category_id
Upvotes: 0