Reputation: 157
I'm trying to create a pandas DataFrame using the Snowflake connector package in Python.
I run a query:
sf_cur = get_sf_connector()
sf_cur.execute("USE WAREHOUSE Warehouse;")
sf_cur.execute("""select Query""")
print('done')
The output is roughly 21k rows. Then
df = pd.DataFrame(sf_cur.fetchall())
takes forever, even on a limited sample of only 100 rows. Is there a way to optimize this? Ideally the bigger query would be run in a loop, so that even larger data sets could be handled.
Upvotes: 2
Views: 7015
Reputation: 248
Use
df = cur.fetch_pandas_all()
to build a pandas DataFrame directly from the result set.
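For result sets too large to hold comfortably in one DataFrame, the connector also offers fetch_pandas_batches(), which yields one DataFrame per batch (both methods require the pyarrow extra of the connector). A minimal sketch, where batches_to_df is a hypothetical helper, not part of the Snowflake API:

```python
import pandas as pd

def batches_to_df(batches):
    # Concatenate the per-batch DataFrames yielded by
    # cur.fetch_pandas_batches() into a single DataFrame.
    return pd.concat(batches, ignore_index=True)

# Against a live Snowflake cursor (connection details omitted):
#   cur.execute("select Query")
#   df = cur.fetch_pandas_all()                      # everything at once
#   df = batches_to_df(cur.fetch_pandas_batches())   # or batch by batch
```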
Upvotes: 0
Reputation: 57
Since fetchall() copies the entire result set into memory, you should iterate over the cursor object directly and build the DataFrame inside the loop:
cursor.execute(query)
for row in cursor:
    # build the data frame row by row
Another example, just to illustrate:
query = "Select ID from Users"
cursor.execute(query)
list_ids = []
for row in cursor:
    # access by name assumes a DictCursor; the default cursor
    # returns tuples, so you would use row[0] instead
    list_ids.append(row["ID"])
Upvotes: 3