Reputation: 120
First off I am sorry if my question is unclear but it is kind of difficult to really express it in one short sentence. Basically, I have a table that looks like this:
+-----------+-------------+
| BookName | BookAuthor |
+-----------+-------------+
| A | Arthur |
+-----------+-------------+
| A | Will |
+-----------+-------------+
| B | Jack |
+-----------+-------------+
| B | Jack |
+-----------+-------------+
| B | Charles |
+-----------+-------------+
| A | Will |
+-----------+-------------+
| A | John |
+-----------+-------------+
I am looking for a SQL query that could group both BookName and BookAuthor's same values to return, for each book name, which book author has the most entries associated.
So following my example, BookName "A" would have "Will" as book author, and BookName "B" would have "Jack".
I have tried this:
SELECT * FROM table GROUP BY BookName, BookAuthor HAVING BookName = "A"
, but the result that I will get is "Arthur".
Any help appreciated; thanks a lot.
Upvotes: 2
Views: 9124
Reputation: 1271051
In statistics, this is called the mode. One relatively simple way to do this in MySQL is with two levels of aggregation:
select bookname,
substring_index(group_concat(bookauthor order by cnt desc), ',', 1) as mode_author
from (select bookname, bookauthor, count(*) as cnt
from t
group by bookname, bookauthor
) b
group by bookname;
There are some nuances to this. If authors can have commas in their name, then a different separator is needed. Also, if the list of authors exceeds the default maximum length for group_concat()
, then that needs to be extended.
MySQL 8+ of course simplifies this by supporting window functions:
select bookname, bookauthor
from (select bookname, bookauthor, count(*) as cnt,
row_number() over (partition by bookname order by count(*) desc) as seqnum
from t
group by bookname, bookauthor
) b
where seqnum = 1;
You do not say what to do in the case of ties. This retrieves one arbitrary best author. But changing row_number()
to rank()
returns all of them.
Upvotes: 0
Reputation: 426
with cte_books
as
(
select bookname,bookauthor
,row_number() over(partition by bookname,bookauthor order by bookname,bookauthor) as [NumOfBooks]
from elbat
)
select a.bookname,a.bookauthor,a.NumOfBooks
from cte_books a
inner join (
select bookname, max([NumOfBooks]) as [NumOfBooks] from cte_books group by bookname
) as b
on a.bookname = b.bookname
and a.[NumOfBooks] = b.[NumOfBooks]
Upvotes: 0
Reputation: 133400
If you want the number of entry you could use count() and group by
select BookName, count(*)
from my_table
group by BookName
order by count(*)
if you want the one with the most entry you could use limit 1
select BookName, count(*)
from my_table
group by BookName
order by count(*)
limit 1
and for most frequent name in each bookname you could try
select BookName, BookAuthor , count(*)
from my_table
group by BookNane, BookAuthor
order by count(*)
Upvotes: 2
Reputation: 37497
You can group by name and author. In a HAVING
clause compare the count to another aggregation by name and author in a subquery getting the count but this time filtered for the name and limited to the (a) row with the maximum count.
SELECT t1.bookname,
t1.bookauthor
FROM elbat t1
GROUP BY t1.bookname,
t1.bookauthor
HAVING count(*) = (SELECT count(*)
FROM elbat t2
WHERE t2.bookname = t1.bookname
GROUP BY t2.bookname,
t2.bookauthor
ORDER BY count(*) DESC
LIMIT 1);
It doesn't break ties though. But you didn't mention whether you need that and how the rules are in such a case.
Upvotes: 2
Reputation: 405
Select count(BookAuthor) as 'NoOfAuthAsso' from table group by BookName
This will give you no of associated authors for each book
Upvotes: 0