Reputation: 25
I would like to take cells in every row and make them into a string of names... My method already deals with casing.
For example, the table;
'john' | | 'smith' | 'smith'
'john' | 'paul' | | 'smith'
'john' | 'john' | 'john' |
returns:
'john smith'
'john paul smith'
'john'
This would need to run postgreSQL 8.2.15 of postgres so I can't make use of potentially useful functions like CONCAT, and data is in a greenplum db.
Alternatively, a method to directly delete duplicate tokens in a list of strings would let me achieve the larger objective. For example:
'john smith john smith'
'john john smith'
'smith john smith'
returns
'john smith'
'john smith'
'smith john'
The order of the tokens is not important, as long as all the unique values are returned, once only.
Thanks
Upvotes: 0
Views: 3559
Reputation: 1269443
I would do this by unpivoting the data and then reaggregation:
select id, string_agg(distinct col)
from (select id, col1 from t union all
select id, col2 from t union all
select id, col3 from t union all
select id, col4 from t
) t
where col is not null
group by id;
This assumes that each row has an unique id.
You can also use a giant case
:
select concat_ws(',',
col1,
(case when col2 <> col1 then col2 end),
(case when col3 <> col2 and col3 <> col1 then col3 end),
(case when col4 <> col3 and col4 <> col2 and col4 <> col1 then col4 end)
) as newcol
from t;
In ancient versions of Postgres, you can phrase this as:
select trim(leading ',' from
(coalesce(',' || col1, '') ||
(case when col2 <> col1 then ',' || col2 else '' end) ||
(case when col3 <> col2 and col3 <> col1 then ',' || col3 else '' end),
(case when col4 <> col3 and col4 <> col2 and col4 <> col1 then ',' || col4 else '' end)
)
) as newcol
from t;
Upvotes: 0
Reputation: 14
I've come up with a solution for you! :)
The following query returns the four columns (which I named col_1,2,3and 4) and removes the duplicates by joining the test_table with itself.
Here is the code:
SELECT t1.col_1, t2.col_2, t3.col_3, t4.col_4
FROM (
SELECT id, col_1
FROM test_table
) AS t1
LEFT JOIN (
SELECT id, col_2
FROM test_table
) as t2
ON (t2.id = t1.id and t2.col_2 <> t1.col_1)
LEFT JOIN (
SELECT id, col_3
FROM test_table
) as t3
ON (t3.id = t1.id and t3.col_3 <> t1.col_1 and t3.col_3 <> t2.col_2)
LEFT JOIN (
SELECT id, col_4
FROM test_table
) as t4
ON (t4.id = t1.id and t4.col_4 <> t1.col_1 and t4.col_4 <> t2.col_2 and t4.col_4 <> t3.col_3);
If you want to obtain the final string, you just substitute the "SELECT" row with this one:
SELECT trim(both ' ' FROM (COALESCE(t1.col_1, '') || ' ' || COALESCE(t2.col_2, '') || ' ' || COALESCE(t3.col_3, '') || ' ' || COALESCE(t4.col_4, '')))
this should work with your version of postgres, according with the docs:
[for the trim and concatenation functions]
https://www.postgresql.org/docs/8.2/static/functions-string.html
//***************************************************
[for the coalesce function]
https://www.postgresql.org/docs/8.2/static/functions-conditional.html
Please let me know if I've been of help :)
P.S. Your question sounds like a bad database design: I would have those columns moved on a table in which you could do this operation by using a group by or something similar. Moreover I would do the string concatenation on a separate script. But that's my way of doing :)
Upvotes: 0
Reputation: 2459
Normalize your table structure, select distinct name values from that table, create a function to aggregate strings (see, e.g., How to concatenate strings of a string field in a PostgreSQL 'group by' query?), and apply that function. Except for the aggregate function creation, this could all be done in a single statement or view.
Upvotes: 2