Reputation: 47
I have established connection with SQL using below code and have extracted the data from SQL table, converted into dataframe and ran the predictive model. I have the output generated and want to add the values of output column alone in the database based on Unique ID column.
server = 'Servername'
database = 'DBName'
username = 'username'
password = 'password'
cnxn = pyodbc.connect('DRIVER={SQL Server};SERVER='+server+';DATABASE='+database+';UID='+username+';PWD='+ password)
sql ='SELECT * FROM TableName'
DF= pd.read_sql(sql,cnxn)
I have columns 'UniqueID','Description','Date','Predicted' in dataframe 'DF' which is retrieved from database. I have predicted the output 'Predicted' and is available in my dataframe. I need to overwrite back only the value in 'Predicted' column of the database based on UniqueID.
Please let me know if there is any way out or we can just overwrite complete dataframe to database table.
Upvotes: 1
Views: 2432
Reputation: 1511
The best method I've found is to take advantage of an SQL inner join and temporary tables to update the values. This works well if you need to update many records in SQL.
Apologies if there are any errors here as I'm borrowing this from a class I've written.
cursor = cnxn.cursor()
# reduce number of calls to server on inserts
cursor.fast_executemany = True
# insert only the key and the updated values
subset = DF[['UniqueID','Predicted']]
# form SQL insert statement
columns = ", ".join(subset.columns)
values = '('+', '.join(['?']*len(subset.columns))+')'
# insert
statement = "INSERT INTO #temp_TableName ("+columns+") VALUES "+values
insert = [tuple(x) for x in subset.values]
cursor.executemany(statement, insert)
statement = '''
UPDATE
TableName
SET
u.Predicted
FROM
TableName AS t
INNER JOIN
#temp_TableName AS u
ON
u.UniqueID=t.UnqiueID;
'''
cursor.execute(statement)
Upvotes: 1