Shiyas

Reputation: 710

Export Postgres data to S3 with headers on each row

I was able to export data from Postgres to AWS S3 by following this document, using the aws_commons extension.

The table columns are id and name. With this table I was able to export a CSV file using the query below:

SELECT * FROM aws_s3.query_export_to_s3('select * from sample_table',
   aws_commons.create_s3_uri('rds-export-bucket', 'rds-export', 'us-east-1'),
   options := 'format csv, HEADER true'
);

With that query I'm able to generate a CSV file like:

id,name
1,Monday
2,Tuesday
3,Wednesday

But is it possible to generate the CSV file in the following format?

id:1,name:Monday
id:2,name:Tuesday
id:3,name:Wednesday

I tried creating a different table with a jsonb column and inserting each row as JSON, but the export then contained curly braces and doubled double quotes.

Sample insertion commands:

CREATE TABLE unstructured_table (data JSONB NOT NULL);
INSERT INTO unstructured_table VALUES($$
  {
    "id": "1",
    "name": "test"
  }
  $$);

After exporting from this table, I'm getting a CSV file like:

"{""id"": ""1"", ""name"": "test"}"

Thanks in advance

Upvotes: 2

Views: 1277

Answers (1)

Daniel Vérité

Reputation: 61526

JSON requires double quotes around strings, and CSV requires double quotes around fields that contain commas or double quotes.

If your goal is to produce a comma-separated list of ColumnName:ColumnValue, for all columns and rows without any kind of quoting, then this requirement is not compatible with the CSV format.
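
As a quick illustration, here is a minimal sketch (plain COPY, runnable in psql) showing that any field containing a comma must be quoted on CSV output:

COPY (SELECT 'id:1,name:Monday' AS line) TO STDOUT WITH (FORMAT csv);
-- prints: "id:1,name:Monday"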

It could, however, be generated relatively generically in SQL, for all columns and rows of any sample_table with id as its primary key, with a query like this:

select string_agg(k||':'||v, ',')              -- "key:value" pairs joined with commas
  from sample_table t,
       lateral row_to_json(t) as r(j),         -- the whole row as a JSON object
       lateral json_object_keys(j) as keys(k), -- one row per column name
       lateral (select j->>k) as val(v)        -- the matching value, as text
 group by id
 order by id;
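
With the question's sample data, that query should return one line per row (key order follows the table's column order, because row_to_json produces a json object that preserves it):

id:1,name:Monday
id:2,name:Tuesday
id:3,name:Wednesday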

If you feed that query to aws_s3.query_export_to_s3 with a format csv option, it will enclose each output line with double quotes. That may be close enough to your goal.

Alternatively, the text format could be used. Then the lines would not be enclosed in double quotes, but backslash sequences might be emitted for control characters and backslashes themselves (see the text format in the COPY documentation).
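
For example, here is a minimal sketch of that export, reusing the bucket and region from the question (the inner query is the one above, dollar-quoted to avoid nested quoting):

SELECT * FROM aws_s3.query_export_to_s3(
   $$select string_agg(k||':'||v, ',')
       from sample_table t,
            lateral row_to_json(t) as r(j),
            lateral json_object_keys(j) as keys(k),
            lateral (select j->>k) as val(v)
      group by id order by id$$,
   aws_commons.create_s3_uri('rds-export-bucket', 'rds-export', 'us-east-1'),
   options := 'format text'
);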

Ideally the result of this query should be output verbatim into a file, not using COPY, as you would do locally in a shell with:

psql -Atc "select ..." > output-file.txt

But it doesn't seem like aws_s3.query_export_to_s3 provides an equivalent way to do that, since it's an interface on top of the COPY command.

Upvotes: 2
