Reputation: 1263
In any SELECT
query, where a DISTINCT ON
is used, how can one additionally get the number of duplicates for each row in the result set?
Take e.g.
SELECT
DISTINCT ON (building)
building,
name
FROM ...
WHERE ...
This will only return the first result for each building. I want to add another column, so the results look like this:
name | building | excluded
Fred | Office | 0
Bob | Storage | 3
when there are more people than Bob in Storage. I'm using Postgres 10.
Upvotes: 1
Views: 2545
Reputation:
You can use a window function:
with data (name, building) as (
values
('Bob', 'Storage'),
('Bob', 'Storage'),
('Bob', 'Storage'),
('Bob', 'Storage'),
('Fred', 'Office'),
('Tim', 'Home'),
('Tim', 'Home')
)
select distinct on (building) *,
count(*) over (partition by building) - 1 as excluded
from data
order by building;
returns:
name | building | excluded
-----+----------+---------
Tim | Home | 1
Fred | Office | 0
Bob | Storage | 3
This works because the window function is evaluated before the distinct on ()
However this means doing some work twice. I think it might be faster to re-use the partitioning "work" to also filter out the duplicates:
with ranked as (
select *,
count(*) over w - 1 as excluded,
row_number() over w as rn
from your_table
window w as (partition by building)
)
select *
from ranked
where rn = 1;
Upvotes: 7
Reputation: 121604
You can simply use group by
instead of distinct on
(to avoid window functions):
with data (name, building) as (
values
('Bob', 'Storage'),
('Bob', 'Storage'),
('Bob', 'Storage'),
('Bob', 'Storage'),
('Fred', 'Office'),
('Tim', 'Home'),
('Tim', 'Home')
)
select min(name), building, count(*)- 1 as excluded
from data
group by building
order by building;
min | building | excluded
------+----------+----------
Tim | Home | 1
Fred | Office | 0
Bob | Storage | 3
(3 rows)
Upvotes: 1
Reputation: 22514
Use window functions?
select
first_value(name) over (partition by building order by /* your order */) first_name
first_value(building) over (partition by building order by /* your order */) building,
count(*) over (partition by building order by /* your order */) - 1 as excluded
from (
select name, building
from my_source_table
);
Upvotes: -1