Valentin Soveaux
Valentin Soveaux

Reputation: 120

How to group same column values together with SQL?

First off I am sorry if my question is unclear but it is kind of difficult to really express it in one short sentence. Basically, I have a table that looks like this:

+-----------+-------------+
| BookName  | BookAuthor  |
+-----------+-------------+
| A         | Arthur      |
+-----------+-------------+
| A         | Will        |
+-----------+-------------+
| B         | Jack        |
+-----------+-------------+
| B         | Jack        |
+-----------+-------------+
| B         | Charles     |
+-----------+-------------+
| A         | Will        |
+-----------+-------------+
| A         | John        |
+-----------+-------------+

I am looking for a SQL query that could group both BookName and BookAuthor's same values to return, for each book name, which book author has the most entries associated.

So following my example, BookName "A" would have "Will" as book author, and BookName "B" would have "Jack".

I have tried this: SELECT * FROM table GROUP BY BookName, BookAuthor HAVING BookName = "A", but the result that I will get is "Arthur".

Any help appreciated; thanks a lot.

Upvotes: 2

Views: 9124

Answers (5)

Gordon Linoff
Gordon Linoff

Reputation: 1271051

In statistics, this is called the mode. One relatively simple way to do this in MySQL is with two levels of aggregation:

select bookname,
       substring_index(group_concat(bookauthor order by cnt desc), ',', 1) as mode_author
from (select bookname, bookauthor, count(*) as cnt
      from t
      group by bookname, bookauthor
     ) b
group by bookname;

There are some nuances to this. If authors can have commas in their name, then a different separator is needed. Also, if the list of authors exceeds the default maximum length for group_concat(), then that needs to be extended.

MySQL 8+ of course simplifies this by supporting window functions:

select bookname, bookauthor 
from (select bookname, bookauthor, count(*) as cnt,
             row_number() over (partition by bookname order by count(*) desc) as seqnum
      from t
      group by bookname, bookauthor
     ) b
where seqnum = 1;

You do not say what to do in the case of ties. This retrieves one arbitrary best author. But changing row_number() to rank() returns all of them.

Upvotes: 0

Roger Clerkwell
Roger Clerkwell

Reputation: 426

with cte_books
as
(
  select bookname,bookauthor
  ,row_number() over(partition by bookname,bookauthor order by bookname,bookauthor) as [NumOfBooks]
  from elbat
)

select a.bookname,a.bookauthor,a.NumOfBooks
from cte_books a
inner join (
             select bookname, max([NumOfBooks]) as [NumOfBooks] from cte_books group by bookname
           ) as b
on a.bookname = b.bookname
and a.[NumOfBooks] = b.[NumOfBooks]

Upvotes: 0

ScaisEdge
ScaisEdge

Reputation: 133400

If you want the number of entry you could use count() and group by

select BookName, count(*)
from my_table  
group by BookName
order by count(*)

if you want the one with the most entry you could use limit 1

select BookName, count(*)
from my_table  
group by BookName
order by count(*)
limit 1

and for most frequent name in each bookname you could try

select BookName, BookAuthor  ,  count(*)
from my_table  
group by BookNane, BookAuthor  
order by count(*)

Upvotes: 2

sticky bit
sticky bit

Reputation: 37497

You can group by name and author. In a HAVING clause compare the count to another aggregation by name and author in a subquery getting the count but this time filtered for the name and limited to the (a) row with the maximum count.

SELECT t1.bookname,
       t1.bookauthor
       FROM elbat t1
       GROUP BY t1.bookname,
                t1.bookauthor
       HAVING count(*) = (SELECT count(*)
                                 FROM elbat t2
                                 WHERE t2.bookname = t1.bookname
                                 GROUP BY t2.bookname,
                                          t2.bookauthor
                                 ORDER BY count(*) DESC
                                 LIMIT 1);

db<>fiddle

It doesn't break ties though. But you didn't mention whether you need that and how the rules are in such a case.

Upvotes: 2

Vivek
Vivek

Reputation: 405

Select count(BookAuthor) as 'NoOfAuthAsso' from table group by BookName

This will give you no of associated authors for each book

Upvotes: 0

Related Questions