Reputation: 42957
I am not so into database and I have the following doubt related a query that I am runngin on a MySql DB.
I have this query:
SELECT
CD.id AS id,
CD.commodity_name_en AS commodity_name_en
FROM MarketDetails AS MD
INNER JOIN MarketDetails_CommodityDetails AS MD_CD
ON MD.id = MD_CD.market_details_id
INNER JOIN CommodityDetails AS CD
on MD_CD.commodity_details_id = CD.id
WHERE MD.localization_id = 1
that returns something like this (representing the list of commodities in some markets):
id commodity_name_en
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 Rice-Asia
2 Apple banana
3 Avocado
4 Red onion
5 White onion
6 Green Beans
7 Bell pepper
8 Carrot
9 Tomatoes
10 Irish potatoes
11 Maize
1 Rice-Asia
3 Avocado
5 White onion
8 Carrot
11 Maize
2 Apple banana
7 Bell pepper
9 Tomatoes
10 Irish potatoes
1 Rice-Asia
As you can see these commodities can appear multiple times (because a specific commodity can be sold in multiple market).
I want to change my query in such a way that every commodities appear only once (because in the end I want the list of all the possible commodities without duplication).
So I know that I can do in this way:
SELECT
CD.id AS id,
CD.commodity_name_en AS commodity_name_en
FROM MarketDetails AS MD
INNER JOIN MarketDetails_CommodityDetails AS MD_CD
ON MD.id = MD_CD.market_details_id
INNER JOIN CommodityDetails AS CD
on MD_CD.commodity_details_id = CD.id
WHERE MD.localization_id = 1
GROUP BY id
I am grouping by ID but I can group by also by name (it is the same).
My doubt is: can I use the distinct statment to obtain the same behavior?
Reading here it seems that it could be a solution: https://www.tutorialspoint.com/sql/sql-distinct-keyword.htm
So I try also this solution and it seems return the same result
SELECT DISTINCT
CD.id AS id,
CD.commodity_name_en AS commodity_name_en
FROM MarketDetails AS MD
INNER JOIN MarketDetails_CommodityDetails AS MD_CD
ON MD.id = MD_CD.market_details_id
INNER JOIN CommodityDetails AS CD
on MD_CD.commodity_details_id = CD.id
WHERE MD.localization_id = 1
So what is the exact difference between the DISTINCT and the GROUP BY solution? And what is the smartest one in a case like mine?
Thank you
Upvotes: 0
Views: 110
Reputation: 94914
Don't join everything only to have to remove from that result set later.
You want to select commodities that exist in certain markets. So you'd select from the commodities table and look up their markets in the WHERE
clause (with an EXISTS
or IN
clause).
select id, commodity_name_en
from commoditydetails
where id in
(
select md_cd.commodity_details_id
from marketdetails_commoditydetails md_cd
join marketdetails md on md.id = md_cd.market_details_id
where md.localization_id = 1
);
Or, if you like that better, even without any join:
select id, commodity_name_en
from commoditydetails
where id in
(
select commodity_details_id
from marketdetails_commoditydetails
where market_details_id in (select id from marketdetails where md.localization_id = 1)
);
Upvotes: 0
Reputation: 1269753
Forget using either version. If you only want columns from one table, use exists
instead:
SELECT CD.id, CD.commodity_name_en
FROM CommodityDetails CD
WHERE EXISTS (SELECT 1
FROM MarketDetails MD INNER JOIN
MarketDetails_CommodityDetails MD_CD
ON MD.id = MD_CD.market_details_id
WHERE MD_CD.commodity_details_id = CD.id AND
MD.localization_id = 1
);
With this version, MySQL does not need to do aggregation on the entire result set -- and that can be a big cost savings. This should be able to take advantage of the indexes used for your original query.
Note: I removed the as
for your column aliases. The default alias for CD.id
is id
. There is no reason to specify this explicitly (unless you really like typing and verbose queries).
As for your specific question, it was answered in the comments -- DISTINCT
and GROUP BY
should have very similar performance.
Upvotes: 5