Reputation: 21
I know there are a few ways to retrieve data from RDB table. One with pandas as read_sql, the other with cursore.fetchall(). What are the main differences between both ways in terms of:
Thanks!
Upvotes: 2
Views: 6371
Reputation: 4386
That's an interesting question.
For a ~10GB SQLite database, I get the following results for your second question.
pandas.sql_query
seems comparable to speed with the cursor.fetchall
.
The rest I leave as an exercise. :D
import sqlite3
import time
import pandas as pd
def db_operation(db_name):
connection = sqlite3.connect(db_name)
c = connection.cursor()
yield c
connection.commit()
connection.close()
start = time.perf_counter()
for i in range(0, 10):
with db_operation('components.db') as c:
c.execute('''SELECT * FROM hashes WHERE (weight > 50 AND weight < 100)''')
fetchall = c.fetchall()
time.perf_counter() - start
2.967 # fractional seconds
start = time.perf_counter()
for i in range(0, 10):
connection = sqlite3.connect('components.db')
sql_query = pd.read_sql_query('''SELECT * FROM hashes WHERE (weight > 50 AND weight < 100)''', con = connection)
connection.commit()
connection.close()
time.perf_counter() - start
2.983 # fractional seconds
The difference is that cursor.fetchall()
is a bit more spartan (=plain).
pandas.read_sql_query
returns a <class 'pandas.core.frame.DataFrame'>
and so you can use all the methods of pandas.DataFrame
, like pandas.DataFrame.to_latex
, pandas.DataFrame.to_csv
pandas.DataFrame.to_excel
, etc. (documentation link)
One can accomplish the same exact goals with cursor.fetchall
, but needs to press some or a lot extra keys.
Upvotes: 3