Reputation: 1
I am trying to run a fairly complex SQL query from Python (Pandas) and am running into a question I see regularly. My SQL query is about 150 lines, and it is performing multiple tasks, accessing multiple(2) different databases, creating temp tables (6 to be exact), running updates to the tables, etc.
I don't see a way to alter my code reasonably to make this SQL script 'streamlined' and have tried to add 'NO COUNT ON', with no avail. I get the none type error.
All the examples I see are very small bits of code, has anyone solved for this for more complex queries? Aside from the length of the query, I only need to update 3 components each month - run and extract.
I am just really trying to make it a hands off process at this point.
I've been attempting to run using a stationary file location, running some quick replace methods then running the read_sql_query.
Error, and I was expecting to be able to place results into a pandas dataframe and basically copy paste to an external file.
Upvotes: -1
Views: 422
Reputation: 405
The key to optimizing anything is knowing what the parts cost. Look in the SQL profiler to understand how much each query costs. Moving data between servers can be pretty expensive. If you're bringing data from one database back to the panda code and then back to another database this round trip can also be very expensive.
The root of many database performance problems is moving data from one location/arrangement to another. Realizing when you're doing this is typically a key to speeding things up.
Upvotes: 0