monostop
monostop

Reputation: 569

SQL insert from two related csv files

I have a scenario where users uploads a set of csv files to my application with the tables being related with a one-to-many relationship.

Example:

File 1, persons:

ref, birthyear, gender, weight,
"L300", 1983, M, 65,
"L301", 1983, F, 78,

File 2, activities:

ref, activityStart, activityEnd,
"L300", 31Aug2014 23:00, 01Sep2014 00:00,
"L300", 31Aug2014 23:30, 02Sep2014 00:00,
"L300", 01Sep2014 07:00, 03Sep2014 00:00,
"L301", 31Aug2014 19:00, 01Sep2014 00:00,
"L301", 31Aug2014 22:30, 01Sep2014 00:00,

The primary key of both activities and persons is an autoincrement integer and the activity table has a foreign key column person_id.

Currently I'm using SQLAlchemy ORM to create the entities, but the performance is really bad when the number of records is big.

What I want to do is generate the SQL needed to insert these records when they are uploaded but I have problems figuring out how to handle the foreign keys. I cannot insert the persons first and then get the "id" by making a query on the "ref" column since the "ref" column are only unique for the current fileset and not for the whole database.

I'm currently having a sqlite database as a backend but looking into postgres for production.

I think this should be a pretty common scenario and I wonder if anyone has any good ideas on how to solve this problem?

Upvotes: 0

Views: 67

Answers (1)

David S
David S

Reputation: 13871

I'm not sure how to solve in SQLite, but in Postgres, I think you just need to do an insert with a RETURNING IDclause. From the PostgreSQL documentation on inserts:

Insert a single row into table distributors, returning the sequence number generated by the DEFAULT clause:

INSERT INTO distributors (did, dname) VALUES (DEFAULT, 'XYZ Widgets')
   RETURNING did;

9.3 SQL Insert documementation

Upvotes: 1

Related Questions