Reputation: 31
I have a python script where I am trying to loop through multiple sql queries, take the outputs of those queries as simple count(*)'s, and concatenate them into one dataframe.
def function(query):
conn = ...
sql_query = open(query, 'r').read()
df_sql_output = pd.read_sql_query(sql_query,conn)
concatenated_data = []
if not df_sql_output.empty:
concatenated_data.append(df_sql_output)
concatenated_data = pd.concat(concatenated_data, axis=1)
print(concatenated_data)
sqls = ['test1.sql','test2.sql','test3.sql']
for i in sqls:
function(i)
Upvotes: 1
Views: 2523
Reputation: 2129
This is the rough idea how can we do it. If you update your code. I can update my answer.
import pandas as pd
sqls = ['test1.sql','test2.sql','test3.sql']
def func(query):
conn = ...
df = pd.DataFrame()
if len(df) == 0:
df = pd.read_sql(query_string,conn)
else:
df_temp = df.copy()
temp = pd.read_sql(query_string,conn)
df = pd.concat([df, df_temp], ignore_index=True)
return df
for query in sqls:
func(query)
Upvotes: 0
Reputation: 8131
This is just basic Python really. Your code isn't clear because you haven't indented it properly here, but in any case it can't work because concatenated_data
is declared within the scope of function
and is lost when the function exits. There's no way for its effects to persist. You need to learn how functions work. The Python tutorial is the place to look: https://docs.python.org/3/tutorial/controlflow.html#defining-functions. In the meantime here is roughly how to fix your code worrying about wrapping anything in a function:
conn = ...
query_paths = ['test1.sql', 'test2.sql', 'test3.sql']
query_results = []
for query_path in query_paths:
with open(query_path) as file:
query = file.read()
query_results.append(pd.read_sql_query(query, conn))
result = pd.concat(query_results)
Upvotes: 1