brodo80
brodo80

Reputation: 83

Iterate a SQL query via PYODBC and collect the results into a Pandas DF

I am trying to run a query over and over again for all dates in a date range and collect the results into a Pandas DF for each iteration.

I established a connection (PYODBC) and created a list of dates I would like to run through the SQL query to aggregate into a DF. I confirmed that the dates are a list.

link = pyodbc.connect(  Connection Details     )

date = [d.strftime('%Y-%m-%d') for d in pd.date_range('2020-10-01','2020-10-02')]
type(date)

I created an empty DF to collect the results for each iteration of the SQL query and checked the structure.

empty = pd.DataFrame(columns = ['Date', 'Balance']) 
empty

I have the query set up as so:

sql = """
Select dt as "Date", sum(BAL)/1000 as "Balance"
From sales as bal
where bal.item IN (1,2,3,4)
AND bal.dt =  '{}'
group by "Date";

""".format(day)

I tried the following for loop in the hopes of aggregating the results of each query execution into the empty df, but I get a blank df.

for day in date:
    a = (pd.read_sql_query(sql, link))
    empty.append(a)

Any ideas if the issue is related to the SQL setup and/or for loop? A better more efficient way to tackle the issue?

Upvotes: 0

Views: 867

Answers (2)

BezginYuriy
BezginYuriy

Reputation: 29

Looks like you didn't defined the day variable when you generated sql. That may help:

def sql_gen(day):
    sql = """
    Select dt as "Date", sum(BAL)/1000 as "Balance"
    From sales as bal
    where bal.item IN (1,2,3,4)
    AND bal.dt =  '{}'
    group by "Date";

    """.format(day)
    return sql

for day in date:
    a = (pd.read_sql_query(sql_gen(day), link))
    empty.append(a)

Upvotes: 0

Parfait
Parfait

Reputation: 107587

Avoid the loop and run a single SQL query by adding Date as a GROUP BY column and pass start and end dates as parameters for filtering. And use the preferred parameterization method instead of string formatting which pandas.read_sql does support:

# PREPARED STATEMENT WITH ? PLACEHOLDERS
sql = """SALES dt AS "Date"
               , SUM(BAL)/1000 AS "Balance"
         FROM sales
         WHERE item IN (1,2,3,4)
           AND dt BETWEEN  ? AND ?
         GROUP BY dt;
      """

# BIND PARAMS TO QUERY RETURN IN SINGLE DATA FRAME
df = pd.read_sql(sql, conn, params=['2020-10-01', '2020-10-02'])

Upvotes: 1

Related Questions