Reputation: 580
I am trying to insert pandas Data Frame into SQL using sqlalchemy. The table is already existed in Database with three columns ID, Brand and Price. ID is identity column . How I can I check before inserting each row from pandas data frame if the Brand is already existed or not.
import pandas as pd
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
'Price': [22000,25000,27000,35000]
}
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
from sqlalchemy import create_engine, MetaData, Table, select
from six.moves import urllib
params = urllib.parse.quote_plus("DRIVER={SQL Server};SERVER=server;DATABASE=mydb;UID=user;PWD=psw")
engine = create_engine("mssql+pyodbc:///?odbc_connect=%s" % params)
engine.connect()
# suppose df is the data-frame that we want to insert in database
df.to_sql(name='mytable',con=engine, index=False, if_exists='append')
print("inserted)
Upvotes: 2
Views: 1200
Reputation: 31226
You are really looking at a 30 year old relational database insert pattern. INSERT if it's not already in unique index (auto increment is not a meaningful unique key)
I've used mariadb but approach is same across all DBMS. Just stick to SQL92 standard.
table definition
create table car (
id double not null AUTO_INCREMENT,
brand varchar(20) not null,
price double,
primary key (id, brand),
unique key (brand)
)
python/sqlachemy to insert if it does not exist
import pandas as pd
from sqlalchemy import create_engine, MetaData, Table, select
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
'Price': [22000,25000,27000,35000]
}
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
engine = create_engine('mysql+pymysql://sniffer:[email protected]/sniffer')
temptable = "temp"
table = "car"
key = ["Brand"]
df.to_sql(name=temptable,con=engine, index=False, if_exists='append')
conn = engine.connect()
transfersql = f"""insert into {table} ({",".join(df.columns)})
select * from {temptable} t
where not exists
(select 1 from {table} m
where {"and".join([f" t.{col} = m.{col} " for col in key])}
)"""
print(transfersql)
conn.execute(transfersql)
conn.execute(f"drop table {temptable}")
conn.close()
output (generated sql)
insert into car (Brand,Price)
select * from temp t
where not exists
(select 1 from car m
where t.Brand = m.Brand
)
Upvotes: 3