Reputation: 273
I recently struggled with this question on a test for a job. I have the following table.
TABLE sessions
id INTEGER primary key;
userId INTEGER NOT NULL;
duration INTEGER NOT NULL;
I needed to write a query that selected userID and the average duration for each user with more than 1 session. In other words, I needed to average duration for all userIds that appeared more than once in the table. Can anyone help me with this?
Upvotes: 2
Views: 4910
Reputation: 1269773
You would simple use a having
clause:
select userid, avg(duration)
from sessions
group by userid
having count(*) > 1;
That would be the "accepted" answer. Often the fastest approach would be:
select userid, avg(duration)
from sessions s
where exists (select 1 from sessions s2 where s2.userid = s.userid and s2.id <> s.id)
group by userid;
This would be faster if many users are one-and-done and you have an index on (userid, id)
. Why? Because the query filters quickly before the aggregation and significantly reducing the number of rows speeds up the aggregation more than the cost of the check (under the given circumstances).
And, in a database that does integer arithmetic, you might want avg(duration * 1.0)
.
However, an interview candidate would pass on the first query.
Upvotes: 11