mastersom
mastersom

Reputation: 545

How to fix "invalid input syntax for type json" when inserting json str into postgresql with copy method, using python

I am trying to insert data of json type with the copy method into a postgresql db. However I am getting the error mentioned below

invalid input syntax for type json

DETAIL:  Expected end of input, but found ""aco"".
CONTEXT:  JSON data, line 1: "{""aco"...
COPY flights2016jsn04, line 1, column flight: ""{""aco"": [""AXE"", ""AXE"", ""AXE"", ""AXE""], ""dist2Org"": 984753, ""flight_att"": {""ypos"": [8..." 

I am not sure what the issue and a bit comfused on what I am reading on the internet. Below is my code with some sample data.

import json
import io
import pandas as pd
import psycopg2


dict_ = {"dist2Org": 984753, "aco": ["AXE", "AXE", "AXE", "AXE"],
     "flight_att": {"xpos": [823.08988, 6540.32231, 999, 33321],
                    "ypos": [823.08988, 6540.32231, 999, 33321], "zpos": [823.08988, 6540.32231, 999.33321]}}

json_ = json.dumps(dict_)

col_json = ["id", "flight"]
df = pd.DataFrame([65654, json_]).T

df.to_csv('test_df')
output = io.StringIO()
# ignore the index
df.to_csv(output, sep='\t', header=False, index=False)
output.getvalue()
# jump to start of stream
output.seek(0)

conn = None
try:
    # connection string
    conn_string = '{0}{1} {2}'.format("host='localhost' dbname=", 'postgres', "user='postgres' password='xxxx'")
    # conn_string = "host='localhost' dbname='postgres' user='postgres' password='xxxx'"

    # connect to the PostgreSQL database
    conn = psycopg2.connect(conn_string)

    # create a new cursor
    cur = conn.cursor()

    # load data

    cur.copy_from(output, 'flights2016jsn04', null="", columns=(col_json))

    # # commit the changes to the database
    conn.commit()

    # close communication with the database
    cur.close()

except (Exception, psycopg2.DatabaseError) as error:
    print(error)

finally:
    if conn is not None:
        conn.close()

Upvotes: 3

Views: 5708

Answers (1)

emulbreh
emulbreh

Reputation: 3461

Postgres seems to only support quoting for csv input. The \t-separated text format expects no quotes. Pandas on the other hand uses " as the default quote char, which occurs a lot in JSON and needs to be escaped.

There are two options:

  • If you know that there will be no \ts in your data you could try df.to_csv(sep='\t', quoting=csv.QUOTE_NONE) instead.
  • Use psycopg2's copy_expert to copy csv instead.

Upvotes: 4

Related Questions