Reputation: 33
I have a table of Books, that contains the following columns:
Book_Id User_Id
001 1
002 2
001 1
004 2
005 3
006 3
007 2
008 2
009 1
Where :
Book_Id - identifier of a book that a user read; User_Id - identifier of a reader/user.
Let's assume that User1 read books three times, but 2 of them were same, so the user 1 read 2 distinct books (001 and 009). User 2 read 4 distinct books, while user 3 read 2 distinct books. In overall, there are 2 users that read 2 distinct books, and 1 user that read 4 distinct books. The output expected is as below:
Distinct_Books_Count --- User_Count
2 2
4 1
I tried the following:
SELECT COUNT(DISTINCT Book_Id), COUNT(User_Id) FROM Books GROUP BY User_Id
But I receive the following table:
Distinct_Books_Count User_Count
2 3
4 4
2 2
So any alternative solution or changes?
Upvotes: 3
Views: 387
Reputation: 1271051
I call this a "histogram of histograms" query. You can do it using two group by
s:
SELECT num_books, COUNT(*)
FROM (SELECT b.User_Id, COUNT(DISTINCT Book_Id) as num_books
FROM Books b
GROUP BY User_Id
) b
GROUP BY num_books
ORDER BY num_books;
Upvotes: 2