Techie
Techie

Reputation: 45124

How does adding GROUP BY or DISTINCT give the same result set?

SELECT unit.id,
       unit.unit_name,
       unit.description,
       unit.category_id,
       city.name,
       mealbase.name AS mealbase_name,
       unit.province_id,
       unit.rooms,
       unit.max_people,
       unit.thumblocation,
       prices.normal_price,
       prices.holiday_price
FROM   jos_units AS unit,
       jos_prices AS prices,
       jos_cities AS city,
       jos_meal_basis AS mealbase
WHERE  prices.unit_id = unit.id
       AND city.id = unit.city_id
       AND unit.published = 1
       AND unit.mealbasis_id = mealbase.id

When I run this query It gives me redundant result set as below.

enter image description here

But If I add

SELECT DISTINCT unit.id Instead of SELECT unit.id at the beginning Or GROUP BY unit.unit.id at the end. It gives me correct result set as below.

enter image description here

My issue is What's wrong with my query(join above gives redundant result even I have corrected joined them)? Why does the adding SELECT DISTINCT unit.id or GROUP BY unit.unit.id is same for the query(which fixes the issue) here? (DISTINCT AND GROUP BY are different functionalities)

Given that I know adding `SELECT DISTINCT unit.id will remove the redundant results but how does the adding one of the two snippet gives same result set? Obviously SELECT DISTINCT unit.id should remove redundant rows by how does the GROUP BY do it?

Upvotes: 0

Views: 325

Answers (4)

Mahmoud Gamal
Mahmoud Gamal

Reputation: 79979

join above gives redundant result even I have corrected joined them why is it?

Thats because of how your tables:

  • jos_units.
  • jos_prices.
  • jos_cities.
  • jos_meal_basis.

are related to each others.

It seems like you have one to many or many to many relations between those tables. For instance, for each record in the jos_meal_basis, each meal has a unit, so many meals might be measured by the same unit, then when joining the two tables you will get redundant units because of this. The same with other tables.

Upvotes: 3

Chris Cameron-Mills
Chris Cameron-Mills

Reputation: 4657

GROUP BY is primarily used if you want to use aggregate or group functions. For example if you wanted to find the number of rows that match you could do

SELECT
  id
, COUNT(id) num_rows
FROM
...
GROUP BY id

because the COUNT is an aggregate function you need to group by the other columns. If you aren't doing any aggregate functions, GROUP BY is essentially just aggregating the rows up (if that's the way you've written it) causing only one row - the same as DISTINCT.

Upvotes: 1

Mariappan Subramanian
Mariappan Subramanian

Reputation: 10083

Your combination in the first query, ie

(unit.id,
       unit.unit_name,
       unit.description,
       unit.category_id,
       city.name,
       mealbase.name AS mealbase_name,
       unit.province_id,
       unit.rooms,
       unit.max_people,
       unit.thumblocation,
       prices.normal_price,
       prices.holiday_price) has duplicates and so you are getting more than 1 rows for the same combination. 

When you use distinct clause or group by it removes duplicates in your above combination. Hope this helps you.

Upvotes: 1

Cynical
Cynical

Reputation: 9588

Basically you are grouping the results without using an aggregation function (using a COUNT, or a MAX, for examples), thus you get the aggregate row in the same way you would obtain it by selecting DISTINCT objects. If you don't need to aggregate them, DISTINCT is the right thing to do.

Upvotes: 3

Related Questions