jay_ee

Reputation: 17

importing single .csv into mysql with python

When running this code I am getting the error: Error while connecting to MySQL Not all parameters were used in the SQL statement

I have also tried to ingest the data with another technique (shown further down).

import mysql.connector as msql
from mysql.connector import Error
import pandas as pd

empdata = pd.read_csv('path_to_file', index_col=False, delimiter = ',')
empdata.head()

try:
    conn = msql.connect(host='localhost', user='test345',  
                        password='test123')
    if conn.is_connected():
        cursor = conn.cursor()
        cursor.execute("CREATE DATABASE timetheft")
        print("Database is created")
except Error as e:
    print("Error while connecting to MySQL", e)

try:
    conn = msql.connect(host='localhost', database='timetheft', user='test345', password='test123')
    if conn.is_connected():
        cursor = conn.cursor()
        cursor.execute("select database();")
        record = cursor.fetchone()
        print("You're connected to database: ", record)
        cursor.execute('DROP TABLE IF EXISTS company;')
        print('Creating table....')
        create_contracts_table = """
         CREATE TABLE company ( ID VARCHAR(40) PRIMARY KEY, 
        Company_Name VARCHAR(40), 
        Country VARCHAR(40), 
        City VARCHAR(40), 
        Email VARCHAR(40),  
        Industry VARCHAR(30),
        Employees VARCHAR(30)
        );
        """
        cursor.execute(create_contracts_table)
        print("Table is created....")
        for i,row in empdata.iterrows():
            sql = "INSERT INTO timetheft.company VALUES (%S, %S, %S, %S, %S,%S,%S,%S)"
            cursor.execute(sql, tuple(row))
            print("Record inserted")
            # the connection is not auto committed by default, so we must commit to save our changes
            conn.commit()
except Error as e:
    print("Error while connecting to MySQL", e)

The second technique I tried:

LOAD DATA LOCAL INFILE 'path_to_file'  
INTO TABLE company  
FIELDS TERMINATED BY ';' 
ENCLOSED BY '"'   
LINES TERMINATED BY '\n'  
IGNORE 1 LINES;

This worked better, but with many errors; only 20% of the rows were ingested.

Finally, here is an excerpt from the .csv (the data is consistent throughout all 1K rows):

"ID";"Company_Name";"Country";"City";"Email";"Industry";"Employees"
217520699;"Enim Corp.";"Germany";"Bamberg";"[email protected]";"Internet";"51-100"
352428999;"Lacus Vestibulum Consulting";"Germany";"Villingen-Schwenningen";"[email protected]";"Food Production";"100-500"
371718299;"Dictum Ultricies Ltd";"Germany";"Anklam";"[email protected]";"Primary/Secondary Education";"100-500"
676789799;"A Consulting";"Germany";"Andernach";"[email protected]";"Government Relations";"100-500"
718526699;"Odio LLP";"Germany";"Eisenhüttenstadt";"[email protected]";"E-Learning";"11-50"

Upvotes: 0

Views: 301

Answers (1)

snakecharmerb

Reputation: 55619

I fixed these issues to get the code to work:

  • make the number of placeholders in the insert statement equal to the number of columns
  • the placeholders should be lower-case '%s'
  • the cell delimiter appears to be a semi-colon, not a comma.
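Concretely, with the seven columns from the excerpt, the placeholder count and delimiter line up like this (a self-contained sketch; the sample row is embedded as a string so it runs without the file or a database):

```python
import csv
import io

# One lower-case %s per column - the excerpt has seven columns.
columns = ["ID", "Company_Name", "Country", "City", "Email", "Industry", "Employees"]
placeholders = ", ".join(["%s"] * len(columns))
sql = f"INSERT INTO company VALUES ({placeholders})"

# The data is semicolon-delimited with double-quoted cells,
# so the reader needs delimiter=";" (quotechar defaults to '"').
sample = (
    '"ID";"Company_Name";"Country";"City";"Email";"Industry";"Employees"\n'
    '217520699;"Enim Corp.";"Germany";"Bamberg";"[email protected]";"Internet";"51-100"\n'
)
reader = csv.reader(io.StringIO(sample), delimiter=";")
next(reader)  # skip the header row
rows = [tuple(r) for r in reader]
```

Each parsed row now has exactly as many values as the statement has placeholders, which is what the "Not all parameters were used" error was complaining about.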

For simply reading a csv with ~1000 rows, Pandas is overkill (and iterrows seems not to behave as you expect). I've used the csv module from the standard library instead.

import csv 

...

sql = "INSERT INTO company VALUES (%s, %s, %s, %s, %s, %s, %s)"

with open("67359903.csv", "r", newline="") as f:
    reader = csv.reader(f, delimiter=";")
    # Skip the header row.
    next(reader)
    # For large files it may be more efficient to commit 
    # rows in batches.
    cursor.executemany(sql, reader)
    conn.commit()
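
The batching mentioned in the comment could be sketched with a small helper (hypothetical, not part of the original answer; the commented usage assumes the cursor and sql from above):

```python
from itertools import islice

def batches(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

# Hypothetical usage with the cursor from the snippet above:
# for chunk in batches(reader, 500):
#     cursor.executemany(sql, chunk)
#     conn.commit()
```

Committing once per batch keeps transactions small without paying the per-row commit cost of the original loop.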

If using the csv module is not convenient, the dataframe's itertuples method may be used to iterate over the data:

empdata = pd.read_csv('67359903.csv', index_col=False, delimiter=';')

for tuple_ in empdata.itertuples(index=False):
    cursor.execute(sql, tuple_)
conn.commit()

Or the dataframe can be dumped to the database directly.

import sqlalchemy as sa

engine = sa.create_engine('mysql+mysqlconnector:///test')
empdata.to_sql('company', engine, index=False, if_exists='replace')
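
to_sql also accepts a chunksize parameter for larger frames. In this sketch an in-memory SQLite engine stands in for MySQL so it runs without a server; swap the URL back to the mysql+mysqlconnector one for real use (the single sample row here is taken from the excerpt above):

```python
import pandas as pd
import sqlalchemy as sa

# In-memory SQLite stands in for MySQL so this runs anywhere;
# for real use: sa.create_engine('mysql+mysqlconnector:///test')
engine = sa.create_engine("sqlite://")

empdata = pd.DataFrame(
    [("217520699", "Enim Corp.", "Germany", "Bamberg",
      "[email protected]", "Internet", "51-100")],
    columns=["ID", "Company_Name", "Country", "City",
             "Email", "Industry", "Employees"],
)
# chunksize controls how many rows are written per INSERT batch.
empdata.to_sql("company", engine, index=False, if_exists="replace", chunksize=500)

with engine.connect() as conn:
    count = conn.execute(sa.text("SELECT COUNT(*) FROM company")).scalar()
```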

Upvotes: 2
