David den Uyl
David den Uyl

Reputation: 53

IMPORT job disappears when using sqlalchemy-cockroachdb

I’m trying to import a climate dataset (contains geometry) to my local cockroachdb (postgis extension is installed) through the sqlalchemy-cockroachdb adapter (v1.4.3). I have a file server running using:

python -m http.server 3003 -d /path/to/file

and I’m using the following engine:

engine = create_engine(f"{DIALECT}://{USER}:{PASSWORD}@{HOST}:{PORT}/{DATABASE}")
print(engine.__repr__())

'Engine(cockroachdb://postgres:***@localhost:26257/climate)'

user postgres has admin rights on the climate database. I then run the following snippet:

    with Session(engine) as e:
        job = e.execute("import pgdump('http://localhost:3003/climate.sql') with ignore_unsupported_statements, DETACHED")
        job_id = job.fetchone()[0]
        print(job_id)

        job_status = e.execute(f"with jobs as (show jobs) select job_id, status from jobs where job_id = {job_id};")
        print(job_status.fetchall())

        sleep(1)

        print(job_status.fetchall())

790071968723927041

[(790071968723927041, 'running')]

[]

While in the first query to the db after creating the job it returns a status for my job, after the program sleeps for one second, there seems to be no trace of the job. It does not fail nor finish. No errors are thrown.

Executing the import statement from the command line using

cockroach sql -e "import pgdump('http://localhost:3003/climate.sql') with ignore_unsupported_statements, DETACHED" -d climate --user postgres --insecure

does actually work and successfully imports the data to my local db.

Does anyone have any idea whats going on? why is the import job initially created but disappears after only one second w/o throwing errors. thanks

Upvotes: 2

Views: 108

Answers (1)

Gord Thompson
Gord Thompson

Reputation: 123819

1. Why the job "disappears".

.execute() returns a SQLAlchemy CursorResult object which allows us to retrieve the results using methods like .fetchall(), .first(), etc.. When .fetchall() is called it returns a list of SQLAlchemy Row objects and the result set has been consumed. Calling .fetchall() on that same object a second time simply returns an empty list.

If you want to check the status of the job after sleep(1) then you need to call .execute(…) again.

2. Why execution seems to have no effect.

I am unable to test IMPORT specifically, but engine.echo = True shows that the construct you are using does not commit the transaction; it rolls it back:

engine.echo = True
with Session(engine) as e:
    import_stmt = "SELECT 1 AS foo"
    job = e.execute(import_stmt)
"""log output:
2022-08-23 09:28:01,754 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-08-23 09:28:01,754 INFO sqlalchemy.engine.Engine SELECT 1 AS foo
2022-08-23 09:28:01,754 INFO sqlalchemy.engine.Engine [generated in 0.00021s] {}
2022-08-23 09:28:01,757 INFO sqlalchemy.engine.Engine ROLLBACK
"""

Session.execute() (ORM) is not the best choice for this type of operation. It would be better to use Core:

engine.echo = True
with engine.begin() as conn:
    import_stmt = "SELECT 1 AS foo"
    job = conn.exec_driver_sql(import_stmt)
"""log output:
2022-08-23 09:32:13,715 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2022-08-23 09:32:13,715 INFO sqlalchemy.engine.Engine SELECT 1 AS foo
2022-08-23 09:32:13,716 INFO sqlalchemy.engine.Engine [raw sql] {}
2022-08-23 09:32:13,718 INFO sqlalchemy.engine.Engine COMMIT
"""

Upvotes: 2

Related Questions