Arranging VARCHAR data into Columns in SQL

Question

My data is in a SQL table in the following format (I have about 20 different answers per user_id):

USER_ID     ANSWER
--------------------------
1           CAR
1           10-20 miles 
1           SALES 
2           TRAIN
2           0-10 miles
2           TEACHER

I would like to create a view in PostgreSQL where all the data is shown according to User ID

USER_ID     ANSWER1     ANSWER2        ANSWER3
-----------------------------------------------
1           CAR         10-20 miles    SALES
2           TRAIN       0-10 miles     TEACHER

Thank you!

Pavel Stehule · Accepted Answer

It is great example of not relational data (the semantic depends on row number), although the data are saved in relational database. You cannot to make relational data from not relational inside relational database. There is not any mechanism how to do this transformation safely. Theoretically, if physical order of your data is same like in your picture, then you can write queries (but you should not to use ORDER BY clause, because it can change order, because your data doesn't hold necessary data for correct ordering):

You need function:

CREATE OR REPLACE FUNCTION public.x_transformed()
 RETURNS TABLE(user_id integer, rowno integer, answer character varying)
 LANGUAGE plpgsql
AS $function$
declare r record; 
prev_user_id int;
begin
  for r in select * from x
  loop
    if user_id is distinct from r.user_id then
      rowno := 1;
    else
      rowno := rowno + 1;
    end if;
    user_id = r.user_id;
    answer = r.answer;
    return next;
  end loop;
end;
$function$

This function can add missing order information to relation. It is pretty ugly solution, but your input data are in very unhappy format, and there is not possible any clean solution (based on relational algebra):

postgres=# select * from x_transformed();
┌─────────┬───────┬─────────────┐
│ user_id │ rowno │   answer    │
╞═════════╪═══════╪═════════════╡
│       1 │     1 │ CAR         │
│       1 │     2 │ 10-20 miles │
│       1 │     3 │ SALES       │
│       2 │     1 │ TRAIN       │
│       2 │     2 │ 0-10 miles  │
│       2 │     3 │ TEACHER     │
└─────────┴───────┴─────────────┘
(6 rows)

Now, the transformation to the requested format is easy (search pivoting):

select user_id,
       max(answer) filter (where rowno = 1) as answer1,
       max(answer) filter (where rowno = 2) as answer2,
       max(answer) filter (where rowno = 3) as answer3
 from x_transformed() group by user_id;
┌─────────┬─────────┬─────────────┬─────────┐
│ user_id │ answer1 │   answer2   │ answer3 │
╞═════════╪═════════╪═════════════╪═════════╡
│       2 │ TRAIN   │ 0-10 miles  │ TEACHER │
│       1 │ CAR     │ 10-20 miles │ SALES   │
└─────────┴─────────┴─────────────┴─────────┘
(2 rows)

The main problem is in format of your input data. It can be good enough for file and application processing, but are terrible bad for storing (and processing) in relation database. The relation (or the table) is heap (not file).

With knowledge of Postgres's internals, you can ensure expected order although you use some relational operations. You can use internal implicit column ctid. ctid is an unique address of row. This will not work on other databases, and it will not work on older Postgres releases:

select user_id,
       max(answer) filter (where rowno = 1) as answer1,
       max(answer) filter (where rowno = 2) as answer2,
       max(answer) filter (where rowno = 3) as answer3
  from (select user_id,
               answer,
               row_number() over (partition by user_id
                                  order by ctid) as rowno
          from x) s
 group by user_id
 order by user_id;
┌─────────┬─────────┬─────────────┬─────────┐
│ user_id │ answer1 │   answer2   │ answer3 │
╞═════════╪═════════╪═════════════╪═════════╡
│       1 │ CAR     │ 10-20 miles │ SALES   │
│       2 │ TRAIN   │ 0-10 miles  │ TEACHER │
└─────────┴─────────┴─────────────┴─────────┘
(2 rows)

Arranging VARCHAR data into Columns in SQL

Answers (2)

Related Questions