ASH
ASH

Reputation: 20342

How can we insert everything in a Pandas dataframe to a table in SQL Server?

I've used SQL Server and Python for several years, and I've used Insert Into and df.iterrows, but I have never tried to push all the contents of a data frame to a SQL Server table. I'm working with some larger data sets now, and I'd like to find an efficient way to move everything in a data frame to a table in SQL Server.

I'm testing this code.

# first I loop through a few files and append everything to a list
# this works fine

# convert the list to a data frame
df_append = DataFrame(df_append)
df_append.shape
type(df_append)

# log into DB
import pyodbc
driver= '{SQL Server Native Client 11.0}'

conn_str = (
    r'DRIVER={SQL Server};'
    r'SERVER=LAPTOP-CEDUMII6;'
    r'DATABASE=TestDB;'
    r'Trusted_Connection=yes;'
)
cnxn = pyodbc.connect(conn_str)

cursor = cnxn.cursor()
cursor.execute('SELECT * FROM FFIEC_CDR_Call_Schedule_RIBII')

for row in cursor:
    print('row = %r' % (row,))

# can log into the DB just fine...
# now I am trying to move the contents of the data frame to the table...

# Here is attempt #1...
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')

# Error:
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')
Traceback (most recent call last):

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1681, in execute
    cur.execute(*args, **kwargs)

ProgrammingError: ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")


The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "<ipython-input-87-2d90babfc8a7>", line 1, in <module>
    df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', cnxn, index=False, if_exists='replace')

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2615, in to_sql
    method=method,

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 598, in to_sql
    method=method,

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1827, in to_sql
    table.create()

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 721, in create
    if self.exists():

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 708, in exists
    return self.pd_sql.has_table(self.name, self.schema)

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1838, in has_table
    return len(self.execute(query, [name]).fetchall()) > 0

  File "C:\Users\ryans\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1693, in execute
    raise ex from exc

DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': ('42S02', "[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'. (208) (SQLExecDirectW); [42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Statement(s) could not be prepared. (8180)")

# Here is attempt #2...same error...
df_append.to_sql('FFIEC_CDR_Call_Schedule_RIBII', schema='dbo', con = cnxn)

I researched this before posting here, and it looks like it's doable. Something in my code must be off, what might be wrong?

Upvotes: 1

Views: 1458

Answers (1)

Gord Thompson
Gord Thompson

Reputation: 123819

pandas to_sql is certainly what you're looking for. Its documentation says that the con parameter can be a

sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection

and that "Legacy support is provided for sqlite3.Connection objects.". So to_sql looks at what you've passed as con and if it is not a SQLAlchemy Connectable (Engine or Connection) then to_sql assumes that it is a sqlite3.Connection. You passed a pyodbc.Connection, which to_sql misinterprets as a sqlite3.Connection, and the resulting error is

[42S02] [Microsoft][ODBC SQL Server Driver][SQL Server]Invalid object name 'sqlite_master'.

The solution is to create a SQLAlchemy Engine object as described here and then pass that Engine object to to_sql.

p.s. For SQL Server, remember to use fast_executemany=True, e.g.,

engine = create_engine(connection_uri, fast_executemany=True)
df.to_sql(table_name, engine, …)

Upvotes: 2

Related Questions