SQL insert from two related csv files

Question

I have a scenario where users uploads a set of csv files to my application with the tables being related with a one-to-many relationship.

Example:

File 1, persons:

ref, birthyear, gender, weight,
"L300", 1983, M, 65,
"L301", 1983, F, 78,

File 2, activities:

ref, activityStart, activityEnd,
"L300", 31Aug2014 23:00, 01Sep2014 00:00,
"L300", 31Aug2014 23:30, 02Sep2014 00:00,
"L300", 01Sep2014 07:00, 03Sep2014 00:00,
"L301", 31Aug2014 19:00, 01Sep2014 00:00,
"L301", 31Aug2014 22:30, 01Sep2014 00:00,

The primary key of both activities and persons is an autoincrement integer and the activity table has a foreign key column person_id.

Currently I'm using SQLAlchemy ORM to create the entities, but the performance is really bad when the number of records is big.

What I want to do is generate the SQL needed to insert these records when they are uploaded but I have problems figuring out how to handle the foreign keys. I cannot insert the persons first and then get the "id" by making a query on the "ref" column since the "ref" column are only unique for the current fileset and not for the whole database.

I'm currently having a sqlite database as a backend but looking into postgres for production.

I think this should be a pretty common scenario and I wonder if anyone has any good ideas on how to solve this problem?

SQL insert from two related csv files

Answers (1)

Related Questions