user1032531
user1032531

Reputation: 26281

When is GROUP BY required for aggregate functions?

I have a table called myEntity as follows:

- id (PK INT NOT NULL)
- account_id (FK INT NOT NULL)
- key (INT NOT NULL.  UNIQUE for given account_id)
- name (VARCHAR NOT NULL.  UNIQUE FOR given account_id)

I don't wish to expose the primary key id to the user, and added key for this purpose. key kind of acts as an auto-increment column for a given accounts_id which will need to be manually done by the application. I first planned on making the primary key composite id-account_id, however, the table is joined to other tables, and before I knew it, I had four columns in a table which could have been one. While account_id-name does the same as account_id-key, key is smaller and will minimize network traffic when a client requests multiple records. Yes, I know it isn't properly normalized, and while not my direct question, would appreciate any constructive criticism comments.

Sorry for the rambling... When is GROUP BY required for an aggregate function? For instance, what about the following? https://stackoverflow.com/a/1547128/1032531 doesn't show one. Is it needed?

SELECT COALESCE(MAX(key),0)+1 FROM myEntity WHERE accounts_id=123;

Upvotes: 3

Views: 4796

Answers (2)

O. Jones
O. Jones

Reputation: 108651

You gave a query as an example not requiring GROUP BY. For the sake of explanation, I'll simplify it as follows.

SELECT MAX(key)
  FROM myEntity
 WHERE accounts_id = 123

Why doesn't that query require GROUP BY? Because you only expect one row in the result set, describing a particular account.

What if you wanted a result set describing all your accounts with one row per account? Then you would use this:

 SELECT accounts_id, MAX(key)
   FROM myEntity
  GROUP BY accounts_id

See how that goes? You get one row in this result set for each distinct value of accounts_id. By the way, MySQL's query planner knows that

 SELECT accounts_id, MAX(key)
   FROM myEntity
  WHERE accounts_id = '123'
  GROUP BY accounts_id

is equivalent to the same query omitting the GROUP BY clause.

One more thing to know: If you have a compound index on (accounts_id, key) in your table, all these queries will be almost miraculously fast because the query planner will satisfy them with a very efficient loose index scan. That's specific to MAX() and MIN() aggregate functions. Loose index scans can't bue used for SUM() or AVG() or similar functions; those require tight index scans.

Upvotes: 4

Richard Thomas
Richard Thomas

Reputation: 360

It's only needed when you need it. For example, if you wanted to return all of the keys, you could use

SELECT COALESCE(MAX(key),0)+1 FROM myEntity GROUP BY accounts_id

rather than your select. But your select is fine (though it seems like you may have made things a little hard for yourself with your structure but I don't know what issues you're trying to address)

Upvotes: 0

Related Questions