Reputation: 20342
I've used SQL Server and Python for several years, and I've used Insert Into and df.iterrows, but I have never tried to push all the contents of a data frame to a SQL Server table. I'm working with some larger data sets now, and I'd like to find an efficient way to move everything in a data frame to a table in SQL Server.
I'm testing this code.
# first I loop through a few files and append everything to a list
# this works fine
# convert the list to a data frame
df_append = DataFrame(df_append)
df_append.shape
type(df_append)
# log into DB
import pyodbc
driver= '{SQL Server Native Client 11.0}'
conn_str = (
r'DRIVER={SQL Server};'
r'SERVER=LAPTOP-CEDUMII6;'
r'DATABASE=TestDB;'
r'Trusted_Connection=yes;'
)
cnxn = pyodbc.connect(conn_str)
cursor = cnxn.cursor()
cursor.execute('SELECT * FROM FFIEC_CDR_Call_Schedule_RIBII')
for row in cursor:
print('row = %r' % (row,))
# can log into the DB just fine...
# now I am trying to move the contents of the data frame to the table...
# Here is attempt #1...
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')
# Error:
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')
Traceback (most recent call last):
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1681, in execute
cur.execute(*args, **kwargs)
ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<ipython-input-87-2d90babfc8a7>", line 1, in <module>
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2615, in to_sql
method=method,
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 598, in to_sql
method=method,
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1827, in to_sql
table.create()
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 721, in create
if self.exists():
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 708, in exists
return self.pd_sql.has_table(self.name, self.schema)
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1838, in has_table
return len(self.execute(query, [name]).fetchall()) > 0
File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1693, in execute
raise ex from exc
DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")
# Here is attempt #2...same error...
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', schema='dbo', con = cnxn)
I researched this before posting here, and it looks like it's doable. Something in my code must be off, what might be wrong?
Upvotes: 1
Views: 1458
Reputation: 123819
pandas to_sql
is certainly what you're looking for. Its documentation says that the con
parameter can be a
sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection
and that "Legacy support is provided for sqlite3.Connection objects.". So to_sql
looks at what you've passed as con
and if it is not a SQLAlchemy Connectable (Engine or Connection) then to_sql
assumes that it is a sqlite3.Connection. You passed a pyodbc.Connection, which to_sql
misinterprets as a sqlite3.Connection, and the resulting error is
[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'.
The solution is to create a SQLAlchemy Engine object as described here and then pass that Engine object to to_sql
.
p.s. For SQL Server, remember to use fast_executemany=True
, e.g.,
engine = create_engine(connection_uri, fast_executemany=True)
df.to_sql(table_name, engine, …)
Upvotes: 2