Reputation: 860
I am trying to setup a website in django which allows the user to send queries to a database containing information about their representatives in the European Parliament. I have the data in a comma seperated .txt file with the following format:
Parliament, Name, Country, Party_Group, National_Party, Position
7, Marta Andreasen, United Kingdom, Europe of freedom and democracy Group, United Kingdom Independence Party, Member
etc....
I want to populate a SQLite3 database with this data, but so far all the tutorials I have found only show how to do this by hand. Since I have 736 observations in the file I dont really want to do this.
I suspect this is a simple matter, but I would be very grateful if someone could show me how to do this.
Thomas
Upvotes: 13
Views: 7756
Reputation: 1260
If you want to do it with a simple method using sqlite3, you can do it using these 3 steps:
$ sqlite3 db.sqlite3
sqlite> .separator ","
sqlite> .import myfile.txt table_name
However do keep the following points in mind:
.txt
file should be in the same directory as your db.sqlite3
,"/path/myfile.txt"
when importingYou can use the .tables
command to verify your table name
SQLite version 3.23.1 2018-04-10 17:39:29
Enter ".help" for usage hints.
sqlite> .tables
auth_group table_name
auth_group_permissions django_admin_log
auth_permission django_content_type
auth_user django_migrations
auth_user_groups django_session
auth_user_user_permissions
Upvotes: 0
Reputation: 2434
So assuming your models.py
looks something like this:
class Representative(models.Model):
parliament = models.CharField(max_length=128)
name = models.CharField(max_length=128)
country = models.CharField(max_length=128)
party_group = models.CharField(max_length=128)
national_party = models.CharField(max_length=128)
position = models.CharField(max_length=128)
You can then run python manage.py shell
and execute the following:
import csv
from your_app.models import Representative
# If you're using different field names, change this list accordingly.
# The order must also match the column order in the CSV file.
fields = ['parliament', 'name', 'country', 'party_group', 'national_party', 'position']
for row in csv.reader(open('your_file.csv')):
Representative.objects.create(**dict(zip(fields, row)))
And you're done.
Addendum (edit)
Per Thomas's request, here's an explanation of what **dict(zip(fields,row))
does:
So initially, fields
contains a list of field names that we defined, and row
contains a list of values that represents the current row in the CSV file.
fields = ['parliament', 'name', 'country', ...]
row = ['7', 'Marta Andreasen', 'United Kingdom', ...]
What zip()
does is it combines two lists into one list of pairs of items from both lists (like a zipper); i.e. zip(['a','b,'c'], ['A','B','C'])
will return [('a','A'), ('b','B'), ('c','C')]
. So in our case:
>>> zip(fields, row)
[('parliament', '7'), ('name', 'Marta Andreasen'), ('country', 'United Kingdom'), ...]
The dict()
function simply converts the list of pairs into a dictionary.
>>> dict(zip(fields, row))
{'parliament': '7', 'name': 'Marta Andreasen', 'country': 'United Kingdom', ...}
The **
is a way of converting a dictionary into a keyword argument list for a function. So function(**{'key': 'value'})
is the equivalent of function(key='value')
. So in out example, calling create(**dict(zip(field, row)))
is the equivalent of:
create(parliament='7', name='Marta Andreasen', country='United Kingdom', ...)
Hope this clears things up.
Upvotes: 21
Reputation: 2235
You asked what the create(**dict(zip(fields, row))) line did.
I don't know how to reply directly to your comment, so I'll try to answer it here.
zip takes multiple lists as args and returns a list of their correspond elements as tuples.
zip(list1, list2) => [(list1[0], list2[0]), (list1[1], list2[1]), .... ]
dict takes a list of 2-element tuples and returns a dictionary mapping each tuple's first element (key) to its second element (value).
create is a function that takes keyword arguments. You can use **some_dictionary to pass that dictionary into a function as keyword arguments.
create(**{'name':'john', 'age':5}) => create(name='john', age=5)
Upvotes: 2
Reputation: 169444
As SiggyF says and only slightly differently than Joschua:
Create a text file with your schema, e.g.:
CREATE TABLE politicians ( Parliament text, Name text, Country text, Party_Group text, National_Party text, Position text );
Create table:
>>> import csv, sqlite3
>>> conn = sqlite3.connect('my.db')
>>> c = conn.cursor()
>>> with open('myschema.sql') as f: # read in schema file
... schema = f.read()
...
>>> c.execute(schema) # create table per schema
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit() # commit table creation
Use csv module to read file with data to be inserted:
>>> csv_reader = csv.reader(open('myfile.txt'), skipinitialspace=True)
>>> csv_reader.next() # skip the first line in the file
['Parliament', 'Name', 'Country', ...
# put all data in a tuple
# edit: decoding from utf-8 file to unicode
>>> to_db = tuple([i.decode('utf-8') for i in line] for line in csv_reader)
>>> to_db # this will be inserted into table
[(u'7', u'Marta Andreasen', u'United Kingdom', ...
Insert data:
>>> c.executemany("INSERT INTO politicians VALUES (?,?,?,?,?,?);", to_db)
<sqlite3.Cursor object at 0x1392f50>
>>> conn.commit()
Verify that all went as expected:
>>> c.execute('SELECT * FROM politicians').fetchall()
[(u'7', u'Marta Andreasen', u'United Kingdom', ...
Edit:
And since you've decoded (to unicode) on input, you need to be sure to encode on output.
For example:
with open('encoded_output.txt', 'w') as f:
for row in c.execute('SELECT * FROM politicians').fetchall():
for col in row:
f.write(col.encode('utf-8'))
f.write('\n')
Upvotes: 4
Reputation: 6034
Something like the following should work: (not tested)
# Open database (will be created if not exists)
conn = sqlite3.connect('/path/to/your_file.db')
c = conn.cursor()
# Create table
c.execute('''create table representatives
(parliament text, name text, country text, party_group text, national_party text, position text)''')
f = open("thefile.txt")
for i in f.readlines():
# Insert a row of data
c.execute("""insert into representatives
values (?,?,?,?,?,?)""", *i.split(", ")) # *i.split(", ") does unpack the list as arguments
# Save (commit) the changes
conn.commit()
# We can also close the cursor if we are done with it
c.close()
Upvotes: 0
Reputation: 23195
You could read the data using the csv module. Then you can create an insert sql statement and use the method executemany:
cursor.executemany(sql, rows)
or use add_all if you use sqlalchemy.
Upvotes: 2