Reputation: 520
I'm trying to write to a MySQL database with Pandas (v1.3.4), SQLAlchemy (v1.4.26), and PyMySQL (v1.0.2). I can create a new table (called 'test_table') using the pandas to_sql
method, but subsequent attempts to write to the same table gives:
OperationalError: (pymysql.err.OperationalError) (1050, "Table 'test_table' already exists")
I've done this previously in SQLite, so I'm not sure why it's not working in MySQL. Is this a problem with my syntax, or is there something that might need to be changed in the database server configuration?
Here's the code I'm using.
First, import and establish a connection to the database server:
from sqlalchemy import create_engine
import pymysql
import pandas as pd
sqlEngine = create_engine('mysql+pymysql://username:[email protected]', pool_recycle=3600)
con = sqlEngine.connect()
Establish the specific database name:
sql = '''
USE my_database
'''
con.execute(sql);
Generate an entry and write to a new table called test_table
:
entry = pd.DataFrame({
'PersonID': 0,
'LastName': 'smith',
'FirstName': 'joe',
}, index=[0])
entry.to_sql('test_table', con, if_exists='append')
Verify that my entry made it into the table:
sql = '''
SELECT *
FROM test_table
'''
pd.read_sql_query(sql, con)
which gives:
So far, so good. Now I try to add a new entry my test_table
table, using the if_exists='append'
argument so that the new entry will be appended to the end of my existing table:
entry = pd.DataFrame({
'PersonID': 1,
'LastName': 'smith',
'FirstName': 'mary',
}, index=[0])
entry.to_sql('test_table', con, if_exists='append')
Which results in:
OperationalError: (pymysql.err.OperationalError) (1050, "Table 'test_table' already exists")
[SQL:
CREATE TABLE test_table (
`index` BIGINT,
`PersonID` BIGINT,
`LastName` TEXT,
`FirstName` TEXT
)
]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
Why is Pandas trying to create a new table here? How can I force it to append to the existing table instead?
Upvotes: 4
Views: 4652
Reputation: 1559
It could be a simple capitalization issue.
I was trying to:
df.to_sql('Table1', con=engine, index=False, if_exists='append', chunksize=batch_size)
when I needed to:
df.to_sql('TABLE1', con=engine, index=False, if_exists='append', chunksize=batch_size)
Since TABLE1
matched the table capitalization in the SQL database (Snowflake).
Upvotes: 0
Reputation: 1
I solved the problem just by using the WITH clause and sleep. This error seems to occur because the connection did not validate the table creation before the next step
sleep(2)
with con:
entry.to_sql('test_table', con, if_exists='append')
Upvotes: 0
Reputation: 91
I had the same problem and I found two ways to solve it although I lack the insight as to why this solves it:
pd.to_sql
.Doing both does not hurt.
```
#create connection to MySQL DB via sqlalchemy & pymysql
user = credentials['user']
password = credentials['password']
port = credentials['port']
host = credentials['hostname']
dialect = 'mysql'
driver = 'pymysql'
db_name = 'test_db'
# setup SQLAlchemy
from sqlalchemy import create_engine
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/'
engine = create_engine(cnx)
# create database
with engine.begin() as con:
con.execute(f"CREATE DATABASE {db_name}")
############################################################
# either pass the db_name vvvv - HERE- vvvv after creating a database
cnx = f'{dialect}+{driver}://{user}:{password}@{host}:{port}/{db_name}'
############################################################
engine = create_engine(cnx)
table = 'test_table'
col = 'test_col'
with engine.begin() as con:
# this would work here instead of creating a new engine with a new link
# con.execute(f"USE {db_name}")
con.execute(f"CREATE TABLE {table} ({col} CHAR(1));")
# insert into database
import pandas as pd
df = pd.DataFrame({col : ['a','b','c']})
with engine.begin() as con:
# this has no effect here
# con.execute(f"USE {db_name}")
df.to_sql(
name= table,
if_exists='append',
con=con,
############################################################
# or pass it as a schema vvvv - HERE - vvvv
#schema=db_name,
############################################################
index=False
)```
Tested with python version 3.8.13
, sqlalchemy 1.4.32
and pandas 1.4.2
.
Same problem might have appeared here and here.
Upvotes: 6