How to get a count of unique rows?

Question

I'm trying to get a count of rows that only differ by one record so I can find out what is "historically" the correct row by determining the most frequently occurring combination. The rows will look something like this:

RowAVal1 | RowAVal2 | RowAVal3 | DiffVal1
RowAVal1 | RowAVal2 | RowAVal3 | DiffVal1
RowAVal1 | RowAVal2 | RowAVal3 | DiffVal2
RowAVal1 | RowAVal2 | RowBVal1 | DiffVal1

For this example, for the RowAVal1 | RowAVal2 | RowAVal3 combination, the rows with DiffVal1 would be the historically correct combination because it appears the most. I need to figure out how to count these rows.

Gordon Linoff · Accepted Answer

If I understand correctly, you want the most common value of the fourth column for combinations of the first three. This is called the mode in statistics and is easy to calculate with aggregation and window functions:

select t.*
from (select col1, col2, col3, col4, count(*) as cnt,
             row_number() over (partition by col1, col2, col3 order by count(*) desc) as seqnum
      from t
      group by col1, col2, col3, col4
     ) t
where seqnum = 1;

How to get a count of unique rows?

Answers (1)

Related Questions