John Doe

Reputation: 33

Count and Distinct SQL query

I have a table named Books that contains the following columns:

Book_Id   User_Id
001       1
002       2
001       1
004       2
005       3
006       3
007       2
008       2
009       1

Where:

Book_Id - identifier of a book that a user read; User_Id - identifier of a reader/user.

Let's assume that user 1 read books three times, but two of those reads were of the same book, so user 1 read 2 distinct books (001 and 009). User 2 read 4 distinct books, while user 3 read 2 distinct books. Overall, there are 2 users who read 2 distinct books and 1 user who read 4 distinct books. The expected output is:

Distinct_Books_Count  User_Count
2                     2
4                     1

I tried the following:

SELECT COUNT(DISTINCT Book_Id), COUNT(User_Id) FROM Books GROUP BY User_Id

But I receive the following table:

Distinct_Books_Count  User_Count
2                     3
4                     4
2                     2

This gives one row per user instead of one row per distinct-book count. Is there an alternative solution, or what should I change?

Upvotes: 3

Views: 387

Answers (1)

Gordon Linoff

Reputation: 1271051

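I call this a "histogram of histograms" query. You can do it using two GROUP BYs: the inner one counts the distinct books per user, and the outer one counts how many users share each of those counts: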

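-- Outer query: how many users share each distinct-book count
SELECT num_books AS Distinct_Books_Count, COUNT(*) AS User_Count
FROM (-- Inner query: number of distinct books read by each user
      SELECT b.User_Id, COUNT(DISTINCT b.Book_Id) AS num_books
      FROM Books b
      GROUP BY b.User_Id
     ) per_user
GROUP BY num_books
ORDER BY num_books;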
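
As a quick sanity check, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same two-level query. SQLite, Python's sqlite3 module, the column types, and the names `ROWS`, `QUERY`, `books_histogram`, `per_user` and `user_count` are assumptions made for illustration; the question does not say which database is in use.

```python
import sqlite3

# Sample rows copied from the question: (Book_Id, User_Id)
ROWS = [
    ("001", 1), ("002", 2), ("001", 1),
    ("004", 2), ("005", 3), ("006", 3),
    ("007", 2), ("008", 2), ("009", 1),
]

# Inner GROUP BY: distinct books per user.
# Outer GROUP BY: number of users per distinct-book count.
QUERY = """
SELECT num_books, COUNT(*) AS user_count
FROM (SELECT User_Id, COUNT(DISTINCT Book_Id) AS num_books
      FROM Books
      GROUP BY User_Id
     ) per_user
GROUP BY num_books
ORDER BY num_books;
"""


def books_histogram(rows):
    """Return (distinct_books_count, user_count) tuples for the given rows."""
    conn = sqlite3.connect(":memory:")
    try:
        # Column types are assumed; the question only shows the values.
        conn.execute("CREATE TABLE Books (Book_Id TEXT, User_Id INTEGER)")
        conn.executemany("INSERT INTO Books VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for num_books, user_count in books_histogram(ROWS):
        print(num_books, user_count)
```

With the sample data this should print one row for the two users who read 2 distinct books and one row for the single user who read 4, which matches the expected output in the question.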

Upvotes: 2
