user177196

Reputation: 728

Import .csv files into SQL database using SQLite in Python

I have two .txt files that I converted to .csv files using https://convertio.co/csv-xlsx/. I would now like to import these two files, person.csv and person_votes.csv, into two database tables using SQLite in Python (in a Jupyter Notebook). I followed the code given here (Importing a CSV file into a sqlite3 database table using Python):

import sqlite3, csv

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE person (personid STR,age STR,sex STR,primary_voting_address_id STR,state_code STR,state_fips STR,county_name STR,county_fips STR,city STR,zipcode STR, zip4 STR,  PRIMARY KEY(personid))") 

with open('person.csv','r') as person_table: # `with` statement available in 2.5+
    # csv.DictReader uses first line in file for column headings by default
    dr = csv.DictReader(person_table) # comma is default delimiter
#personid   age sex primary_voting_address_id   state_code  state_fips  county_name county_fips city    zipcode zip4
    to_db = [(i['personid'], i['age'], i['sex'], i['primary_voting_address_id'], i['state_code'], i['state_flips'], i['county_name'], i['county_fips'], i['city'], i['zipcode'], i['zip4']) for i in dr]

cur.executemany("INSERT INTO t (age, sex) VALUES (?, ?);", to_db)
con.commit()

When I execute the code above, I keep getting the error "KeyError: 'personid'", and I don't understand why. Could someone please help?

Also, if I create another database table named to_db2 for the file person_votes.csv in the same Python file, would the following query give me all the elements common to the two tables?

select ID from to_db, to_db2 WHERE to_db.ID ==  to_db2

The link to the two .csv files above is here: https://drive.google.com/open?id=0B-cyvC6eCsyCQThUeEtGcWdBbXc.

Upvotes: 1

Views: 9297

Answers (2)

Marichyasana

Reputation: 3154

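This works for me on Windows 10 and should also work under Linux/Unix. There are several problems:

  1. The last two rows of person.csv are not in the correct format. This does not prevent the program from working, and you can fix it with a text editor.
  2. person.csv uses tabs, not commas, as the delimiter. This is what causes the KeyError: with the default comma delimiter, csv.DictReader reads the whole header line as a single column name, so there is no 'personid' key.
  3. There is a typo in the line that starts with "to_db =": 'state_flips' should be 'state_fips'.
  4. The INSERT statement names 2 columns, but each tuple in to_db holds 11 values.
  5. The executemany call uses the wrong table name (t instead of person).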
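
As a way to confirm problem 2 before changing anything else, it can help to print the column names that csv.DictReader actually found. The following is a minimal sketch using a made-up three-column sample in place of the real person.csv; the sample rows are an assumption for illustration only.

```python
import csv
import io

# Hypothetical two-line sample standing in for the tab-delimited person.csv.
sample = "personid\tage\tsex\nP1\t34\tF\n"

# With the default comma delimiter, the whole header becomes one column name,
# which is why i['personid'] raises KeyError.
wrong = csv.DictReader(io.StringIO(sample))
print(wrong.fieldnames)   # ['personid\tage\tsex']

# With delimiter='\t', the header splits into the expected column names.
right = csv.DictReader(io.StringIO(sample), delimiter="\t")
print(right.fieldnames)   # ['personid', 'age', 'sex']

# csv.Sniffer can guess the delimiter from a sample of the text.
dialect = csv.Sniffer().sniff(sample, delimiters="\t,")
print(repr(dialect.delimiter))
```

With the real file, the same check would be `print(dr.fieldnames)` right after creating the reader.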

In addition, I create the database in a file rather than in memory. The data is small enough that performance should not be a problem, and any changes you make will be saved.
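
To illustrate the difference, here is a small sketch, assuming a throwaway database file in a temporary directory (the path and the one-column table are hypothetical). Data committed to a file is visible to a later connection, while every new ":memory:" connection starts empty.

```python
import os
import sqlite3
import tempfile

# Hypothetical location for the database file, used only for this demonstration.
db_path = os.path.join(tempfile.mkdtemp(), "person.db")

con = sqlite3.connect(db_path)
con.execute("CREATE TABLE person (personid TEXT PRIMARY KEY)")
con.execute("INSERT INTO person VALUES (?)", ("P1",))
con.commit()
con.close()

# A new connection to the same file still sees the committed row.
con = sqlite3.connect(db_path)
file_rows = con.execute("SELECT personid FROM person").fetchall()
con.close()

# A new in-memory connection has no tables at all.
mem = sqlite3.connect(":memory:")
mem_tables = mem.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
mem.close()

print(file_rows, mem_tables)
```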

Here is my corrected file (you can do the other table yourself):

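import sqlite3, csv

# con = sqlite3.connect(":memory:")
con = sqlite3.connect("person.db")  # file-based database, so the data persists
cur = con.cursor()
# TEXT rather than STR: SQLite gives an unrecognized type name such as STR
# NUMERIC affinity, which would store a zip code like "02134" as the integer 2134.
cur.execute("CREATE TABLE person (personid TEXT,age TEXT,sex TEXT,primary_voting_address_id TEXT,state_code TEXT,state_fips TEXT,county_name TEXT,county_fips TEXT,city TEXT,zipcode TEXT, zip4 TEXT,  PRIMARY KEY(personid))")

# newline='' is what the csv module documentation recommends for files passed to a reader
with open('person.csv', 'r', newline='') as person_table:
    dr = csv.DictReader(person_table, delimiter='\t')  # the file is tab-delimited; comma is the default
    to_db = [(i['personid'], i['age'], i['sex'], i['primary_voting_address_id'], i['state_code'], i['state_fips'], i['county_name'], i['county_fips'], i['city'], i['zipcode'], i['zip4']) for i in dr]

# One placeholder per column, in the order the columns were declared
cur.executemany("INSERT INTO person VALUES (?,?,?,?,?,?,?,?,?,?,?);", to_db)
con.commit()
con.close()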
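
On the second part of the question: to_db and to_db2 are Python lists in the script, so the query has to name the tables instead, and it needs a column on both sides of the comparison. The sketch below assumes the second table is called person_votes and shares a personid column with person. The columns of person_votes.csv are not shown in the question, so that column name and the sample rows are assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Cut-down stand-ins for the real tables; the person_votes columns are assumed.
cur.execute("CREATE TABLE person (personid TEXT PRIMARY KEY, age TEXT)")
cur.execute("CREATE TABLE person_votes (personid TEXT, election TEXT)")
cur.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("P1", "34"), ("P2", "51"), ("P3", "27")],
)
cur.executemany(
    "INSERT INTO person_votes VALUES (?, ?)",
    [("P1", "2016"), ("P3", "2016"), ("P3", "2012"), ("P9", "2016")],
)

# IDs present in both tables; DISTINCT collapses people with several votes.
common = cur.execute(
    "SELECT DISTINCT p.personid "
    "FROM person AS p "
    "JOIN person_votes AS v ON v.personid = p.personid "
    "ORDER BY p.personid"
).fetchall()
print(common)  # [('P1',), ('P3',)]

con.close()
```

An explicit JOIN ... ON is used here rather than a comma-separated table list with WHERE. Both forms work in SQLite, but the JOIN keeps the matching condition next to the tables it relates.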

Upvotes: 1

Ash B

Reputation: 121

It looks like you are missing some column names in your INSERT INTO ... statement.

Leaving the primary key as NULL is probably not good practice either.
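
A minimal sketch of both points, using a three-column version of the table (the cut-down schema and the sample values are assumptions for illustration). In SQLite, a primary key column that is not an INTEGER PRIMARY KEY accepts NULL unless NOT NULL is declared, so the constraint is spelled out here.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# NOT NULL is explicit because SQLite otherwise allows NULL in a TEXT primary key.
cur.execute(
    "CREATE TABLE person (personid TEXT PRIMARY KEY NOT NULL, age TEXT, sex TEXT)"
)

# Naming the columns ties each placeholder to a specific column.
cur.execute(
    "INSERT INTO person (personid, age, sex) VALUES (?, ?, ?)",
    ("P1", "34", "F"),
)

# Omitting personid would leave it NULL, which the constraint now rejects.
try:
    cur.execute("INSERT INTO person (age, sex) VALUES (?, ?)", ("51", "M"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = cur.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, row_count)
con.close()
```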

Upvotes: 0
