Matt
Matt

Reputation: 616

Grouping multiple rows into one string postgres for each position in select

I have a table education that has a column university. For each of the rows in the table I want to find 3 most similar universities from the table.

Here is my query that finds 3 most similar universities to a given input:

select distinct(university),
               similarity(unaccent(lower(university)),
                          unaccent(lower('Boston university')))
from education
order by similarity(unaccent(lower(university)),
                    unaccent(lower('Boston university'))) desc
limit 3;

It works fine. But now I would like to modify this query so that I get two columns and a row for each existing university in the table: the first column would be the university name and the second would be the three most similar universities found in the database (or if its easier - four columns where the first is the university and the next 3 are the most similar ones).

What should this statement look like?

Upvotes: 0

Views: 258

Answers (2)

GMB
GMB

Reputation: 222402

You could use an inline aggregated query:

with u as (select distinct university from education)
select 
    university,
    (
        select string_agg(u.university, ',')
        from u 
        where u.university != e.university
        order by similarity(
            unaccent(lower(u.university)), 
            unaccent(lower(e.university))
        ) desc
        limit 3
    ) similar_universities
from education e

This assumes that a given university may occur more than once in the education table (hence the need for a cte).

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269443

I think a lateral join and aggregation does what you want:

select e.university, e2.universities
from education e cross join lateral
     (select array_agg(university order by sim desc) as universities
      from (select e2.university,
                   similarity(unaccent(lower(e2.university)),
                              unaccent(lower(e.university))
                             ) as sim
            from education e2
            order by sim desc
            limit 3
           ) e2
      ) e2;

Note: The most similar university is probably the one with the same name. (You can filter that out with a where clause in the subquery.)

This returns the value as an array, because I prefer working with arrays rather than strings in Postgres.

Upvotes: 0

Related Questions