Samir Said
Samir Said

Reputation: 373

In SQL, why does group by make a difference when using having count()

I have a table that stores zone_id. Sometimes a zone id is twice in the database. I wrote a query to show only entries that have two or more entries of the same zone_id in the table.

The following query returns the correct result:

select *, count(zone_id)
from proxies.storage_used
group by zone_id desc
having count(zone_id) > 1;

However, if I group by last_updated or company_id, it returns random values. If I don't add a group by clause, it only displays one value as per the screenshot below. First output shows above query string, second output shows same query string without the 'group by' line and returns only one value:

correction: I'm a new member and thus can't post pictures directly, so I added it on minus: http://min.us/m3yrlkSMu#1o

While my query works, I don't understand why. Can somebody help me understand why group by is altering the actual output, instead of only the grouping of the output? I am using MySQL.

Upvotes: 1

Views: 888

Answers (2)

nolt2232
nolt2232

Reputation: 2644

A group by divides the resulting rows into groups and performs the aggregate function on the records in each group. If you do a count(*) without a group by you will get a single count of all rows in a table. Since you didn't specify a group by there is only one group, all records in the table. If you do a count(*) with a group by of zone id, you will get a count of how many records there are for each zone id. If you do a count(*) of zone id and last updated date, you will get a count of how many rows were updated on each date in each zone.

Upvotes: 2

mgibsonbr
mgibsonbr

Reputation: 22007

Without a group by clause, everything is stored in the same group, so you get a single result. If there are more than one row in your table, then the having will succeed. So, you'll end up counting all the rows in your table...

source

From what I got, you could create a query with having and without group by only in two situations:

  1. You have a where clause, and you want to test a condition on an aggregation of all rows that satisfy that clause.

  2. Same as above, but for all rows in your table (in practice, it doesn't make sense, though).

Upvotes: 1

Related Questions