Reputation: 214
In order to use the COPY
command in PostgreSQL (in my case, from a CSV file), I need to create the destination table first.
Now, if my table has 60 columns, for instance, it feels weird and inefficient to write this manually:
CREATE TABLE table_name(
column1 datatype,
column2 datatype,
column3 datatype,
.....
column60 datatype
);
Those who use PostgreSQL – how do you get around this issue?
Upvotes: 0
Views: 1868
Reputation: 2513
I usually use the file_fdw
extension to read data from CSV files.
But unfortunately, file_fdw
is not that convenient/flexible for such tasks as reading a CSV file with many columns. CREATE FOREIGN TABLE
will work with any number of columns, but if they don't correspond to the CSV file, it will fail later, when performing a SELECT
. So the problem of explicitly creating the table remains. However, it is possible to solve it.
Here is a brute-force approach that doesn't require anything except Postgres. Written in PL/pgSQL, this function tries to create a foreign table with one single column and attempts to SELECT
from it. If that fails, it drops the table and tries again, but with 2 columns. And so on, until the SELECT
succeeds. All columns are of type text
– this is quite a limitation, but it still solves the task of having a ready-to-SELECT table instead of doing manual work.
create or replace function autocreate_table_to_read_csv(
  fdw_server text,
  csv text,
  table_name text,
  max_columns_num int default 100
) returns void as $$
declare
  i int;
  sql text;
  rec record;
begin
  execute format('drop foreign table if exists %I', table_name);
  for i in 1..max_columns_num loop
    begin
      -- build "create foreign table ... (col1 text, ..., coli text) ..."
      -- (order by n guarantees col1..coli appear in order)
      select into sql
        format('create foreign table %I (', table_name)
        || string_agg('col' || n::text || ' text', ', ' order by n)
        || format(
          e') server %I options ( filename \'%s\', format \'csv\' );',
          fdw_server,
          csv
        )
      from generate_series(1, i) as g(n);
      raise debug 'SQL: %', sql;
      execute sql;
      execute format('select * from %I limit 1;', table_name) into rec;
      -- looks OK, so the number of columns corresponds to the first row of the CSV file
      raise info 'Table % created with % column(s). SQL: %', table_name, i, sql;
      exit;
    exception when others then
      -- the failed attempt is rolled back by the exception block's subtransaction
      raise debug 'CSV has more than % column(s), making another attempt...', i;
    end;
  end loop;
end;
$$ language plpgsql;
Once it finds the proper number of columns, it reports it (see raise info
).
To see more details, run set client_min_messages to debug;
before using the function.
Example of use:
test=# create server csv_import foreign data wrapper file_fdw;
CREATE SERVER
test=# set client_min_messages to debug;
SET
test=# select autocreate_table_to_read_csv('csv_import', '/home/nikolay/tmp/sample.csv', 'readcsv');
NOTICE: foreign table "readcsv" does not exist, skipping
DEBUG: SQL: create foreign table readcsv (col1 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
DEBUG: CSV has more than 1 column(s), making another attempt...
DEBUG: SQL: create foreign table readcsv (col1 text, col2 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
DEBUG: CSV has more than 2 column(s), making another attempt...
DEBUG: SQL: create foreign table readcsv (col1 text, col2 text, col3 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
INFO: Table readcsv created with 3 column(s). SQL: create foreign table readcsv (col1 text, col2 text, col3 text) server csv_import options ( filename '/home/nikolay/tmp/sample.csv', format 'csv' );
autocreate_table_to_read_csv
------------------------------
(1 row)
test=# select * from readcsv limit 2;
col1 | col2 | col3
-------+-------+-------
1313 | xvcv | 22
fvbvb | 2434 | 4344
(2 rows)
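The all-text limitation can also be softened outside the database. Below is a rough client-side sketch (my own illustration, not part of the answer above; the `col1..colN` naming and the type heuristic are assumptions) that infers slightly better column types from sample rows before generating the DDL:

```python
import csv
import io

def infer_pg_type(values):
    """Guess a Postgres type for a column from sample string values.
    Very rough heuristic: integer -> numeric -> text."""
    def all_match(conv):
        for v in values:
            try:
                conv(v)
            except ValueError:
                return False
        return True
    if all_match(int):
        return "integer"
    if all_match(float):
        return "numeric"
    return "text"

def typed_ddl_from_sample(csv_text, table_name):
    """Build a CREATE TABLE statement, inferring each column's type
    from all sample rows. Assumes the CSV has no header row, like the
    sample file in the example above."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    columns = list(zip(*rows))  # transpose: one tuple of values per column
    cols_sql = ", ".join(
        f"col{i + 1} {infer_pg_type(col)}" for i, col in enumerate(columns)
    )
    return f"create table {table_name} ({cols_sql});"

sample = "1313,xvcv,22\nfvbvb,2434,4344\n"
print(typed_ddl_from_sample(sample, "readcsv"))
# create table readcsv (col1 text, col2 text, col3 integer);
```

With a larger sample the guesses get more reliable, but anything ambiguous still safely falls back to text.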
Update: I found an implementation of a very similar approach (but without the brute force – it requires explicitly specifying the number of columns in the CSV file) for COPY .. FROM
: How to import CSV file data into a PostgreSQL table?
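That non-brute-force idea can be sketched client-side as well: read the first CSV line, count its fields, and generate matching DDL plus a COPY command. This is my own illustration under those assumptions (the function name and `col1..colN` naming are hypothetical, not from the linked answer):

```python
import csv
import io

def copy_ready_ddl(csv_text, table_name):
    """Count the columns in the first CSV row and emit matching all-text
    DDL plus a COPY command, mirroring the 'explicit number of columns'
    approach from the linked answer."""
    first_row = next(csv.reader(io.StringIO(csv_text)))
    cols = ", ".join(f"col{i + 1} text" for i in range(len(first_row)))
    ddl = f"create table {table_name} ({cols});"
    copy_sql = f"copy {table_name} from stdin with (format csv);"
    return ddl, copy_sql

ddl, copy_sql = copy_ready_ddl("1313,xvcv,22\n", "readcsv")
print(ddl)       # create table readcsv (col1 text, col2 text, col3 text);
print(copy_sql)  # copy readcsv from stdin with (format csv);
```

Unlike the brute-force function, this needs only one pass over the first line, but it trusts that every row has as many fields as the first one.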
P.S. Actually, this would be a really good area in which to improve the file_fdw
and COPY .. FROM
capabilities of Postgres, making them more flexible – for example, for postgres_fdw
there is a very handy IMPORT FOREIGN SCHEMA
command, which lets you define remote ("foreign") objects very quickly, with just one line – it saves a lot of effort. Having a similar thing for CSV data would be awesome.
Upvotes: 1