Reputation: 210
Imagine that I have the next two SQL Server tables:
CREATE TABLE Users (
id INT IDENTITY(1, 1) PRIMARY KEY,
name VARCHAR(100) NOT NULL
)
CREATE TABLE UserLogins (
id INT IDENTITY(1, 1) PRIMARY KEY,
user_id INT REFERENCES Users(id) NOT NULL,
login VARCHAR(100) NOT NULL
)
And I need to get a count of user logins for each user. And the query result should contain user name, for example. Which query will work faster:
SELECT MAX(name), count(*)
FROM Users u
INNER JOIN UserLogins ul ON ul.user_id = u.id
GROUP BY u.id
or the next one:
SELECT name, count(*)
FROM Users u
INNER JOIN UserLogins ul ON ul.user_id = u.id
GROUP BY u.name
So, I'm not sure, if it will be better to group by the column with an index and then use MAX
or MIN
aggregate function. Or just group by Users.name
, which doesn't have any indexes.
Thank you in advance!
Upvotes: 0
Views: 1318
Reputation: 71544
The answer is: neither is really correct.
The second version is completely wrong as name
is not unique. The first version is correct, although it may not be efficient.
Since name
has a functional dependency on id
, every unique value of id
also defines a value of name
. Grouping by name
is wrong, because name
is not necessarily unique. Grouping only by id
means you need to aggregate name
, which makes no sense if there is a functional dependency. So you actually want to group by both columns:
SELECT
u.name,
count(*)
FROM Users u
INNER JOIN UserLogins ul ON ul.user_id = u.id
GROUP BY
u.id,
u.name;
Note that id
does not actually need to be selected.
This query is almost certainly going to be faster than grouping by name
alone, because the server cannot deduce that name
is unique and needs to sort and aggregate it.
It may also be faster than grouping by id
, although that may depend on whether the optimizer is clever enough to deduce the functional dependency (and therefore no aggregation would be necessary). Even if it isn't clever, this probably won't be slow, as id
is already unique, so a scan of an index over id
would not require a sort, only aggregation.
Upvotes: 2