oaklander114
oaklander114

Reputation: 3383

create database by load a csv files using the header as columnnames (and add a column that has the filename as a name)

I have CSV files that I want to make database tables from in mysql. I've searched all over and can't find anything on how to use the header as the column names for the table. I suppose this must be possible. In other words, when creating a new table in MySQL do you really have to define all the columns, their names, their types etc in advance. It would be great if MySQL could do something like Office Access where it converts to the corresponding type depending on how the value looks.

I know this is maybe a too broadly defined question, but any pointers in this matter would be helpful. I am learning Python too, so if it can be done through a python script that would be great too.

Thank you very much.

Upvotes: 0

Views: 3784

Answers (2)

Serge Ballesta
Serge Ballesta

Reputation: 148965

The csv module can easily give you the column names from the first line, and then the values from the other ones. The hard part will be do guess the correct column types. When you load a csv file into an Excel worksheet, you only have few types : numeric, string, date.

In a database like MySQL, you can define the size of string columns, and you can give the table a primary key and eventually other indexes. You will not be able to guess that part automatically from a csv file.

At the simplest way, you can treat all columns as varchar(255). It is really uncommon to have fields in a csv file that do not fit in 255 characters. If you want something more clever, you will have to scan the file twice : first time to control the maximum size for each colum, and at the end, you could take the minimum power of 2 greater than that. Next step would be to control if any column contains only integers or floating point values. It begins to be harder to do that automatically, because the representation of floating point values may be different depending on the locale. For example 12.51 in an english locale would be 12,51 in a french locale. But Python can give you the locale.

The hardest thing would be eventual date or datetime fields, because there are many possible formats only numeric (dd/mm/yyyy or mm/dd/yy) or using plain text (Monday, 29th of september).

My advice would be to define a default mode, for example all string, or just integer and strings, and use configuration parameters or even a configuration file to finely tune conversion per column.

For the reading part, the csv module will give you all what you need.

Upvotes: 0

Serdmanczyk
Serdmanczyk

Reputation: 1224

Using Python, you could use the csv DictReader module to makes it pretty easy to use the headers from the csv files as labels for the input data. It basically reads all lines in as a dictionary object with the keys as the headers, so you can use the keys as the source for your column names when accessing mySQL.

A quick example that reads a csv into a list of dictionaries:

example.csv:

name,address,city,state,phone
jack,111 washington st, somewhere, NE, 888-867-5309
jill,112 washington st, somewhere else, NE, 888-867-5310
john,113 washington st, another place, NE, 888-867-5311

example.py:

import csv

data = []
with open("example.csv") as csvfile:
    reader = csv.DictReader(csvfile)
    for line in reader:
        data.append(line)

print(data[0].keys())
print(data[0]['address'])
print(data[1]['name'])
print(data[2]['phone'])

output:

$:python example.py
dict_keys(['name', 'address', 'city', 'state', 'phone'])
111 washington st
jill
 888-867-5311

More in-depth examples at: http://java.dzone.com/articles/python-101-reading-and-writing

Some info on connection to MySQL in Python: How do I connect to a MySQL Database in Python?

Upvotes: 1

Related Questions