Reputation: 129

Unique values with string_agg on more than 1 column

I am trying to group by and get list of values for multiple columns. Here is an example:

City   | State | Income
-------+-------+--------
Salem  |  OH   | 40000
Salem  |  OH   | 45000
Mason  |  OH   | 50000
Dayton |  OH   | 60000
Salem  |  MA   | 40000
Mason  |  MA   | 45000
Mason  |  MA   | 50000
Dayton |  MA   | 70000
Salem  |  PA   | 45000
Mason  |  PA   | 50000
Dayton |  PA   | 60000

The result I am looking for is:

City   |  States    | Income
-------+------------+--------------
Salem  | OH,MA,PA   | 40000,45000
Mason  | OH,MA,PA   | 50000,45000
Dayton | OH,MA,PA   | 60000,70000

I managed to get this far:

City   |  States    | Income
-------+------------+-------------------------
Salem  | OH,MA,PA   | 40000,40000,45000,45000
Mason  | OH,MA,PA   | 50000,50000,50000,45000
Dayton | OH,MA,PA   | 60000,70000,60000

How do I go from here to the result set?

City   |  States    | Income
-------+------------+-------------------------
Salem  | OH,MA,PA   | 40000,45000,50000
Mason  | OH,MA,PA   | 50000,45000
Dayton | OH,MA,PA   | 60000,70000

Upvotes: 2

Answers (3)

Ravi M Patel

Reputation: 3045

Here is one more way of doing it (db fiddle):

select city,
       (select string_agg(value,', ') from (select distinct value from string_split(string_agg(state, ','),',')) t) as states,
       (select string_agg(value,', ') from (select distinct value from string_split(string_agg(income, ','),',')) t) as incomes
from t
group by city;

You may easily convert the splitting and merging part into a reusable scalar valued function.

Experts are welcome to comment on performance.

Upvotes: 0

Gordon Linoff

Reputation: 1271051

Alas, you cannot use string_agg() with distinct. But you can use conditional aggregation:

select city,
       string_agg(case when seqnum_state = 1 then state end, ',') as states,
       string_agg(case when seqnum_income = 1 then income end, ',') as incomes
from (select t.*,
             row_number() over (partition by city, state order by state) as seqnum_state,
             row_number() over (partition by city, income order by income) as seqnum_income
      from t
     ) t
group by city;

Here is a db<>fiddle.

Upvotes: 6

Andrea

Reputation: 12405

You can perform separate group by on (City, state) and (City, Income) to remove duplicates, then you can separately build the (States) and (Incomes) aggregated strings and finally you can join the results in a single table:

DECLARE @tmp TABLE (City VARCHAR(100), State VARCHAR(100), Income int);
INSERT INTO @tmp
VALUES ('Salem' ,'OH', 40000)   ,('Salem' ,'OH', 45000) ,('Mason' ,'OH', 50000) 
      ,('Dayton','OH', 60000)   ,('Salem' ,'MA', 40000) ,('Mason' ,'MA', 45000)
      ,('Mason' ,'MA', 50000)   ,('Dayton','MA', 70000) ,('Salem' ,'PA', 45000) 
      ,('Mason' ,'PA', 50000)   ,('Dayton','PA', 60000)

;with States as(
    select City, state
    from @tmp
    group by  City, state
),
incomes as(
    select City, Income
    from @tmp
    group by  City, Income
)
, states_g as ( 
    select city, STRING_AGG(state,',') as States  
    from states
    group by city
)
, incomes_g as ( 
    select city, STRING_AGG(Income,',') as Incomes  
    from incomes
    group by city
)
select 
    s.City, s.States, i.Incomes 
    from 
        states_g as s
            inner join
        incomes_g as i
            on i.City = s.City

Results:

Upvotes: 1

Unique values with string_agg on more than 1 column

Answers (3)

Related Questions