Reputation: 129
I am trying to group by and get list of values for multiple columns. Here is an example:
City | State | Income
-------+-------+--------
Salem | OH | 40000
Salem | OH | 45000
Mason | OH | 50000
Dayton | OH | 60000
Salem | MA | 40000
Mason | MA | 45000
Mason | MA | 50000
Dayton | MA | 70000
Salem | PA | 45000
Mason | PA | 50000
Dayton | PA | 60000
The result I am looking for is:
City | States | Income
-------+------------+--------------
Salem | OH,MA,PA | 40000,45000
Mason | OH,MA,PA | 50000,45000
Dayton | OH,MA,PA | 60000,70000
I managed to get this far:
City | States | Income
-------+------------+-------------------------
Salem | OH,MA,PA | 40000,40000,45000,45000
Mason | OH,MA,PA | 50000,50000,50000,45000
Dayton | OH,MA,PA | 60000,70000,60000
How do I go from here to the result set?
City | States | Income
-------+------------+-------------------------
Salem | OH,MA,PA | 40000,45000,50000
Mason | OH,MA,PA | 50000,45000
Dayton | OH,MA,PA | 60000,70000
Upvotes: 2
Views: 7546
Reputation: 3045
Here is one more way of doing it (db fiddle):
select city,
(select string_agg(value,', ') from (select distinct value from string_split(string_agg(state, ','),',')) t) as states,
(select string_agg(value,', ') from (select distinct value from string_split(string_agg(income, ','),',')) t) as incomes
from t
group by city;
You may easily convert the splitting and merging part into a reusable scalar valued function.
Experts are welcome to comment on performance.
Upvotes: 0
Reputation: 1271051
Alas, you cannot use string_agg()
with distinct
. But you can use conditional aggregation:
select city,
string_agg(case when seqnum_state = 1 then state end, ',') as states,
string_agg(case when seqnum_income = 1 then income end, ',') as incomes
from (select t.*,
row_number() over (partition by city, state order by state) as seqnum_state,
row_number() over (partition by city, income order by income) as seqnum_income
from t
) t
group by city;
Here is a db<>fiddle.
Upvotes: 6
Reputation: 12405
You can perform separate group by on (City, state) and (City, Income) to remove duplicates, then you can separately build the (States) and (Incomes) aggregated strings and finally you can join the results in a single table:
DECLARE @tmp TABLE (City VARCHAR(100), State VARCHAR(100), Income int);
INSERT INTO @tmp
VALUES ('Salem' ,'OH', 40000) ,('Salem' ,'OH', 45000) ,('Mason' ,'OH', 50000)
,('Dayton','OH', 60000) ,('Salem' ,'MA', 40000) ,('Mason' ,'MA', 45000)
,('Mason' ,'MA', 50000) ,('Dayton','MA', 70000) ,('Salem' ,'PA', 45000)
,('Mason' ,'PA', 50000) ,('Dayton','PA', 60000)
;with States as(
select City, state
from @tmp
group by City, state
),
incomes as(
select City, Income
from @tmp
group by City, Income
)
, states_g as (
select city, STRING_AGG(state,',') as States
from states
group by city
)
, incomes_g as (
select city, STRING_AGG(Income,',') as Incomes
from incomes
group by city
)
select
s.City, s.States, i.Incomes
from
states_g as s
inner join
incomes_g as i
on i.City = s.City
Results:
Upvotes: 1