mRiddle

Reputation: 214

Creating a table with many columns in PostgreSQL

In order to use COPY (in my case, from a CSV file) in PostgreSQL, I need to create the destination table first.

Now, if my table has 60 columns, for instance, it feels weird and inefficient to write this out manually:

CREATE TABLE table_name(
   column1 datatype,
   column2 datatype,
   column3 datatype,
   .....
   column60 datatype
);

Those of you who use PostgreSQL - how do you get around this issue?
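For what it's worth, the repetitive column list can at least be generated rather than typed by hand. A minimal Python sketch (the uniform text type here is just a placeholder assumption; in practice each column may need its own type):

```python
# Generate the repetitive "columnN datatype" list instead of typing it.
# "text" is a placeholder type, not something inferred from the data.
columns = ",\n".join(f"   column{i} text" for i in range(1, 61))
ddl = f"CREATE TABLE table_name(\n{columns}\n);"
print(ddl)
```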

Upvotes: 0

Views: 1868

Answers (1)

Nick

Reputation: 2513

I usually use the file_fdw extension to read data from CSV files.

But unfortunately, file_fdw is not that convenient/flexible for tasks like reading a CSV file with many columns. CREATE FOREIGN TABLE will accept any number of columns, but if the column list doesn't correspond to the CSV file, it will fail later, when you perform a SELECT. So the problem of explicitly creating the table remains. However, it is possible to solve it.

Here is a brute-force approach that doesn't require anything except Postgres. Written in PL/pgSQL, this function tries to create a foreign table with a single column and attempts to SELECT from it. If the SELECT fails, the attempt is rolled back and it tries again with 2 columns, and so on, until the SELECT succeeds. All columns are of type text - this is quite a limitation, but it still solves the task of having a ready-to-SELECT table instead of doing the work manually.

create or replace function autocreate_table_to_read_csv(
  fdw_server text,
  csv text,
  table_name text,
  max_columns_num int default 100
) returns void as $$
declare
  i int;
  sql text;
  rec record;
begin
  execute format('drop foreign table if exists %I', table_name);
  for i in 1..max_columns_num loop
    begin
      -- build "create foreign table" DDL with i text columns: col1 .. col<i>
      select into sql
       format('create foreign table %I (', table_name)
          || string_agg('col' || n::text || ' text', ', ')
          || format(
            e') server %I options ( filename \'%s\', format \'csv\' );',
            fdw_server,
            csv
          )
      from generate_series(1, i) as g(n);
      raise debug 'SQL: %', sql;
      execute sql;
      execute format('select * from %I limit 1;', table_name) into rec;
      -- looks OK, so the number of columns corresponds to the first row of CSV file
      raise info 'Table % created with % column(s). SQL: %', table_name, i, sql;
      return;
    exception when others then
      -- the failed attempt is rolled back automatically; retry with one more column
      raise debug 'CSV has more than % column(s), making another attempt...', i;
    end;
  end loop;
  raise exception 'CSV has more than % columns; increase max_columns_num', max_columns_num;
end;
$$ language plpgsql;

Once it finds the proper number of columns, it reports it (see raise info).

To see more details, run set client_min_messages to debug; before using the function.

Example of use:

test=# create server csv_import foreign data wrapper file_fdw;
CREATE SERVER

test=# set client_min_messages to debug;
SET

test=# select autocreate_table_to_read_csv('csv_import', '/home/nikolay/tmp/sample.csv', 'readcsv');
NOTICE:  foreign table "readcsv" does not exist, skipping
DEBUG:  SQL: create foreign table readcsv (col1 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
DEBUG:  CSV has more than 1 column(s), making another attempt...
DEBUG:  SQL: create foreign table readcsv (col1 text, col2 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
DEBUG:  CSV has more than 2 column(s), making another attempt...
DEBUG:  SQL: create foreign table readcsv (col1 text, col2 text, col3 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
INFO:  Table readcsv created with 3 column(s). SQL: create foreign table readcsv (col1 text, col2 text, col3 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
 autocreate_table_to_read_csv
------------------------------

(1 row)

test=# select * from readcsv limit 2;
 col1  | col2  | col3
-------+-------+-------
 1313  | xvcv  | 22
 fvbvb | 2434  | 4344
(2 rows)

Update: I found an implementation of a very similar approach (but without the brute force - it requires explicit specification of the number of columns in the CSV file), for COPY .. FROM: How to import CSV file data into a PostgreSQL table?
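The same idea can also be taken one step further outside the database: read the CSV header row and emit the CREATE TABLE yourself. A minimal Python sketch (the function name and the all-text column typing are my assumptions, mirroring the brute-force function above, not part of the linked answer):

```python
import csv
import io

def ddl_from_csv_header(csv_text: str, table_name: str) -> str:
    """Build a CREATE TABLE statement from the first (header) row of a CSV.

    All columns are typed text, as in the brute-force function above."""
    header = next(csv.reader(io.StringIO(csv_text)))
    cols = ",\n".join(f'    "{name}" text' for name in header)
    return f'CREATE TABLE "{table_name}" (\n{cols}\n);'

sample = "col1,col2,col3\n1313,xvcv,22\n"
print(ddl_from_csv_header(sample, "readcsv"))
```

Note that this assumes the file actually has a header row; the brute-force function makes no such assumption and works off the first data row instead.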

P.S. Actually, this would be a really good task for improving the file_fdw and COPY .. FROM capabilities of Postgres, making them more flexible. For example, postgres_fdw has a very handy IMPORT FOREIGN SCHEMA command, which lets you define remote ("foreign") objects very quickly, with just one line - it saves a lot of effort. Having a similar thing for CSV data would be awesome.

Upvotes: 1
