Krishnang K Dalal

Reputation: 2566

Insert Data to SQL Server Table using pymssql

I am trying to write a data frame into a SQL Server table. My code:

conn = pymssql.connect(host="Dev02", database="DEVDb")
cur = conn.cursor()
query = "INSERT INTO dbo.SCORE_TABLE VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
cur.executemany(query, df_sql)
conn.commit()
cur.close()
conn.close()

The dimensions of df_sql are (5860, 20), i.e. the number of columns in the data frame is the same as the number of columns in the SQL Server table. Still, I am getting the following error:

ValueError: more placeholders in sql than params available
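For context, DB-API executemany expects a sequence of row sequences, with one value per placeholder in each row. The same class of error can be reproduced with the standard-library sqlite3 driver (a sketch; sqlite3 uses ? placeholders and words the message differently than pymssql):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE score (a INTEGER, b INTEGER, c INTEGER)")

# Correct shape: a sequence of row tuples, one value per placeholder.
rows = [(1, 2, 3), (4, 5, 6)]
cur.executemany("INSERT INTO score VALUES (?, ?, ?)", rows)

# Wrong shape: each "row" supplies fewer values than there are
# placeholders, which raises a binding-count error.
try:
    cur.executemany("INSERT INTO score VALUES (?, ?, ?)", [(1, 2), (3, 4)])
except sqlite3.ProgrammingError as e:
    print(e)

conn.close()
```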

UPDATED BELOW

As per one of the comments, I tried using turbodbc as below:

conn = turbodbc.connect(driver="{SQL Server}", server="Dev02", Database="DEVDb")
conn.use_async_io = True
cur = conn.cursor()
query = "INSERT INTO dbo.STG_CONTACTABILITY_SCORE VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)"
cur.executemany(query, df_sql.values)
cur.commit()
cur.close()
conn.close()

I am getting the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I don't get what is wrong here. I have inspected df_sql.values and I don't see anything wrong.

The first row of the ndarray is as below:

[nan 'DUSTIN HOPKINS' 'SOUTHEAST MISSOURI STATE UNIVERSITY' 13.0
  '5736512217' None None 'Monday' '8:00AM' '9:00AM' 'Summer' None None None
  None '2017-12-22 10:39:30.626331' 'Completed' None '1-11KUFFZ'
  'Central Time Zone']

Upvotes: 7

Views: 33871

Answers (5)

Gabe

Reputation: 6097

pymssql has bulk copy functionality as of v2.2.

The fastest way to insert data to a SQL Server table is often to use the bulk copy functions, for example:

conn = pymssql.connect(server, user, password, "tempdb")
cursor = conn.cursor()
cursor.execute("""
    CREATE TABLE example (
        col1 INT NOT NULL,
        col2 INT NOT NULL
    )
""")
cursor.close()

conn.bulk_copy("example", [(1, 2)] * 1000000)

https://pymssql.readthedocs.io/en/stable/pymssql_examples.html#bulk-copy

It has some limitations; see the project's issue list.

Upvotes: 1

Palash Mondal

Reputation: 538

This works for me:

insert_query = """INSERT INTO dbo.temptable(CHECK_TIME, DEVICE, METRIC, VALUE, TOWER, LOCATION, ANOMALY, ANOMALY_SCORE, ANOMALY_SEVERITY)
            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)"""
write_data = tuple(map(tuple, data_frame.values))
cursor.executemany(insert_query, write_data)
con.commit()
cursor.close()
con.close()
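The key step above is the tuple(map(tuple, ...)) conversion, which turns a 2-D array of rows (such as data_frame.values) into the tuple-of-row-tuples shape that executemany expects. A minimal sketch on plain nested lists:

```python
# Stand-in for data_frame.values: a 2-D sequence of rows.
rows = [[1, "a", None], [2, "b", "x"]]

# Convert every inner row to a tuple, then wrap the whole thing
# in an outer tuple: the shape executemany expects.
write_data = tuple(map(tuple, rows))

print(write_data)  # ((1, 'a', None), (2, 'b', 'x'))
```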

Upvotes: 1

virtualdvid

Reputation: 2421

I think you just need to specify each column name, and don't forget that the table must have an id field to hold the data frame index:

conn = pymssql.connect(host="Dev02", database="DEVDb")
cur = conn.cursor()
query = """INSERT INTO dbo.SCORE_TABLE(index, column1, column2, ..., column20)
            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s,
            %s, %s, %s, %s, %s, %s)"""
cur.executemany(query, df_sql)
conn.commit()
cur.close()
conn.close()

OK, I have been using pandas, and I exported the last data frame to CSV like:

df.to_csv('new_file_name.csv', sep=',', encoding='utf-8')

Then I just used pyodbc and the BULK INSERT Transact-SQL statement like:

import pyodbc

conn = pyodbc.connect(DRIVER='{SQL Server}', Server='server_name', Database='Database_name', trusted_connection='yes')
cur = conn.cursor()

cur.execute("""BULK INSERT table_name
               FROM 'C:\\Users\\folders path\\new_file_name.csv'
               WITH
               (
                   CODEPAGE = 'ACP',
                   FIRSTROW = 2,
                   FIELDTERMINATOR = ',',
                   ROWTERMINATOR = '\n'
               )""")
conn.commit()

cur.close()
conn.close()

It took about a second to load 15,314 rows into SQL Server. I hope this gives you an idea.
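If pandas isn't available, the same comma-delimited file can be produced with the standard-library csv module (a sketch; the column names and rows here are hypothetical). Writing a header row first is what makes FIRSTROW = 2 in the BULK INSERT statement skip it:

```python
import csv

# Hypothetical rows; the first column mirrors the data frame index
# that df.to_csv writes by default.
rows = [(0, "DUSTIN HOPKINS", 13.0), (1, "JANE DOE", 7.5)]

with open("new_file_name.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)  # comma delimiter matches FIELDTERMINATOR = ','
    writer.writerow(["index", "name", "score"])  # header, skipped via FIRSTROW = 2
    writer.writerows(rows)
```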

Upvotes: 7

Parfait

Reputation: 107767

Possibly, executemany treats each row in the ndarray from your df.values call as a single item, since there are no comma separators between the values. Hence the placeholders outnumber the actual bound values, and you receive the mismatch error.

Consider converting the array to a tuple of tuples (or a list of lists, tuple of lists, or list of tuples) and then passing that object into executemany:

query = "INSERT INTO dbo.SCORE_TABLE VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
sql_data = tuple(map(tuple, df.values))

cur.executemany(query, sql_data)
conn.commit()

Upvotes: 2

MaxU - stand with Ukraine

Reputation: 210982

If I understand correctly, you want to use the DataFrame.to_sql() method:

df_sql.to_sql('SCORE_TABLE', engine, schema='dbo', index=False, if_exists='append')

Note that engine must be a SQLAlchemy engine or connection (to_sql does not accept a raw pymssql connection), and the dbo schema goes in the schema argument rather than in the table name.
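As a runnable illustration of to_sql, here is a minimal sketch against the standard-library sqlite3 driver (the call works the same way against a SQLAlchemy engine for SQL Server; the table and column names are made up):

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "score": [1.0, 2.0]})

conn = sqlite3.connect(":memory:")
# Creates the table if needed and appends the rows; index=False
# keeps the data frame index out of the table.
df.to_sql("SCORE_TABLE", conn, index=False, if_exists="append")

print(conn.execute("SELECT COUNT(*) FROM SCORE_TABLE").fetchone()[0])  # 2
```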

Upvotes: 0
