Reputation: 5
I'm trying to get a count of rows that only differ by one record so I can find out what is "historically" the correct row by determining the most frequently occurring combination. The rows will look something like this:
RowAVal1 | RowAVal2 | RowAVal3 | DiffVal1
RowAVal1 | RowAVal2 | RowAVal3 | DiffVal1
RowAVal1 | RowAVal2 | RowAVal3 | DiffVal2
RowAVal1 | RowAVal2 | RowBVal1 | DiffVal1
For this example, for the RowAVal1 | RowAVal2 | RowAVal3 combination, the rows with DiffVal1 would be the historically correct combination because it appears the most. I need to figure out how to count these rows.
Upvotes: 0
Views: 40
Reputation: 1269693
If I understand correctly, you want the most common value of the fourth column for combinations of the first three. This is called the mode in statistics and is easy to calculate with aggregation and window functions:
select t.*
from (select col1, col2, col3, col4, count(*) as cnt,
row_number() over (partition by col1, col2, col3 order by count(*) desc) as seqnum
from t
group by col1, col2, col3, col4
) t
where seqnum = 1;
Upvotes: 1