Finding the row with most common attribute using SQL

Question

I have the following table in my database:

user_id | p1 | p2 | p3
1       | x  | y  | z
2       | x  | x  | x
3       | y  | y  | z

I need to find the row(s) that contains the most common value between that same row. i.e., the first row has no common value, the second contain three common values and the third one contains two common values.

Then, the output in this case should be

user_id | p1 | p2 | p3
2       | x  | x  | x

Any ideas? (It would be nice if the solution did not require a vendor-specific feature, but anything will help).

Martin Smith · Accepted Answer

For a non vendor specific solution You could do

SELECT *
FROM YourTable
ORDER BY 
CASE WHEN p1=p2 THEN 1 ELSE 0 END +
CASE WHEN p1=p3 THEN 1 ELSE 0 END +
CASE WHEN p2=p3 THEN 1 ELSE 0 END DESC

And then LIMIT, TOP, ROW_NUMBER or whatever dependant upon RDBMS to just get the top row.

But if you have a specific RDBMS in mind there may be other ways that are more maintainable for larger number of columns (e.g. for SQL Server 2008)

SELECT TOP 1 *
FROM YourTable
ORDER BY 
(SELECT COUNT (DISTINCT p) FROM (VALUES(p1),(p2),(p3)) T(p))

Also how do you want ties handled?

Finding the row with most common attribute using SQL

Answers (1)

Related Questions