Reputation: 668
I need to load data from multiple JSON files, each containing multiple records, into a Postgres table. I am using the following code, but it does not work (I am using pgAdmin III on Windows):
COPY tbl_staging_eventlog1 ("EId", "Category", "Mac", "Path", "ID")
from 'C:\\SAMPLE.JSON'
delimiter ','
;
The content of the SAMPLE.JSON file looks like this (showing two of many records):
[{"EId":"104111","Category":"(0)","Mac":"ABV","Path":"C:\\Program Files (x86)\\Google","ID":"System.Byte[]"},{"EId":"104110","Category":"(0)","Mac":"BVC","Path":"C:\\Program Files (x86)\\Google","ID":"System.Byte[]"}]
Upvotes: 13
Views: 39220
Reputation: 7310
Try this:
BEGIN;
-- let's create a temp table to bulk data into; note that "values" must be
-- quoted, since VALUES is a reserved word in PostgreSQL
create temporary table temp_json ("values" text) on commit drop;
copy temp_json from 'C:\SAMPLE.JSON';

-- COPY in text mode consumes one level of backslash escaping, so restore
-- the backslashes before casting the text to json
insert into tbl_staging_eventlog1 ("EId", "Category", "Mac", "Path", "ID")
select "values"->>'EId' as EId,
       "values"->>'Category' as Category,
       "values"->>'Mac' as Mac,
       "values"->>'Path' as Path,
       "values"->>'ID' as ID
from (
    select json_array_elements(replace("values",'\','\\')::json) as "values"
    from temp_json
) a;
COMMIT;
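For reference, the question doesn't show the target table's definition; a minimal one matching the insert above (all columns assumed to be text) would be:

create table tbl_staging_eventlog1 (
    "EId" text,
    "Category" text,
    "Mac" text,
    "Path" text,
    "ID" text
);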
Upvotes: 32
Reputation: 345
You can use spyql. Running the following command would generate INSERT statements that you can pipe into psql:
$ jq -c .[] *.json | spyql -Otable=tbl_staging_eventlog1 "SELECT json->EId, json->Category, json->Mac, json->Path, json->ID FROM json TO sql"
INSERT INTO "tbl_staging_eventlog1"("EId","Category","Mac","Path","ID") VALUES ('104111','(0)','ABV','C:\Program Files (x86)\Google','System.Byte[]'),('104110','(0)','BVC','C:\Program Files (x86)\Google','System.Byte[]');
jq turns the JSON arrays from all JSON files in the current directory into JSON lines (one JSON object per line), and spyql then converts those JSON lines into INSERT statements.
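For the sample file from the question, that intermediate jq step should emit one object per line:

$ jq -c .[] SAMPLE.JSON
{"EId":"104111","Category":"(0)","Mac":"ABV","Path":"C:\\Program Files (x86)\\Google","ID":"System.Byte[]"}
{"EId":"104110","Category":"(0)","Mac":"BVC","Path":"C:\\Program Files (x86)\\Google","ID":"System.Byte[]"}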
To import the data into PostgreSQL:
$ jq -c .[] *.json | spyql -Otable=tbl_staging_eventlog1 "SELECT json->EId, json->Category, json->Mac, json->Path, json->ID FROM json TO sql" | psql -U your_user_name -h your_host your_database
Disclaimer: I am the author of spyql.
Upvotes: 0
Reputation: 5601
As mentioned in Andrew Dunstan's PostgreSQL and Technical blog:
In text mode, COPY will be simply defeated by the presence of a backslash in the JSON. So, for example, any field that contains an embedded double quote mark, or an embedded newline, or anything else that needs escaping according to the JSON spec, will cause failure. And in text mode you have very little control over how it works - you can't, for example, specify a different ESCAPE character. So text mode simply won't work.
so we have to fall back to CSV format mode:
copy the_table(jsonfield)
from '/path/to/jsondata'
csv quote e'\x01' delimiter e'\x02';
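Applied to the question's data, a sketch along these lines should work (raw_json is a hypothetical name for a one-column staging table; jsonb_array_elements requires PostgreSQL 9.4 or later):

-- load the whole file, one line per row, into a jsonb column; the control
-- characters \x01 and \x02 should never occur in the data, so quoting and
-- delimiting never kick in and backslashes survive intact
create temporary table raw_json (doc jsonb);
copy raw_json (doc)
from 'C:\SAMPLE.JSON'
csv quote e'\x01' delimiter e'\x02';

-- each line holds a JSON array; unnest it into one row per record
insert into tbl_staging_eventlog1 ("EId", "Category", "Mac", "Path", "ID")
select rec->>'EId', rec->>'Category', rec->>'Mac', rec->>'Path', rec->>'ID'
from raw_json, jsonb_array_elements(doc) as rec;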
The official documentation (sql-copy) lists the parameters as follows:
COPY table_name [ ( column_name [, ...] ) ]
    FROM { 'filename' | PROGRAM 'command' | STDIN }
    [ [ WITH ] ( option [, ...] ) ]
    [ WHERE condition ]

where option can be one of:

    FORMAT format_name
    FREEZE [ boolean ]
    DELIMITER 'delimiter_character'
    NULL 'null_string'
    HEADER [ boolean ]
    QUOTE 'quote_character'
    ESCAPE 'escape_character'
    FORCE_QUOTE { ( column_name [, ...] ) | * }
    FORCE_NOT_NULL ( column_name [, ...] )
    FORCE_NULL ( column_name [, ...] )
    ENCODING 'encoding_name'
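The shorthand used above maps onto this WITH ( option [, ...] ) form; the same COPY can equivalently be written as:

copy the_table (jsonfield)
from '/path/to/jsondata'
with (format csv, quote e'\x01', delimiter e'\x02');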
Upvotes: 0