Reputation: 508
I have multiple sql queries I need to run (via pandas.io.sql / .read_sql) that have a very similar structure so I am attempting to parameterize them.
I am wondering if there is a way to pass in column values using .format (which works for strings).
My query (truncated to simplify this post):
sql= '''
SELECT DISTINCT
CAST(report_suite AS STRING) AS report_suite, post_pagename,
COUNT(DISTINCT(CONCAT(post_visid_high,post_visid_low))) AS unique_visitors
FROM
FOO.db
WHERE
date_time BETWEEN '{0}' AND '{1}'
AND report_suite = '{2}'
GROUP BY
report_suite, post_pagename
ORDER BY
unique_visitors DESC
'''.format(*parameters)
What I would like to do, is be able to parameterize the COUNT(DISTINCT(CONCAT(post_visid_high, post_visid_low))) as Unique Visitors
like this somehow:
COUNT(DISTINCT({3})) as {'4'}
The problem I can't seem to get around is that in order to do this would require storing the column names as something other than a string to avoid the quotes. Is there any good ways around this?
Upvotes: 1
Views: 2024
Reputation: 210832
Consider the following approach:
sql_dynamic_parms = dict(
func1='CONCAT(post_visid_high,post_visid_low)',
name1='unique_visitors'
)
sql= '''
SELECT DISTINCT
CAST(report_suite AS STRING) AS report_suite, post_pagename,
COUNT(DISTINCT({func1})) AS {name1}
FROM
FOO.db
WHERE
date_time BETWEEN %(date_from)s AND %(date_to)s
AND report_suite = %(report_suite)s
GROUP BY
report_suite, post_pagename
ORDER BY
unique_visitors DESC
'''.format(**sql_dynamic_parms)
params = dict(
date_from=pd.to_datetime('2017-01-01'),
date_to=pd.to_datetime('2017-12-01'),
report_suite=111
)
df = pd.read_sql(sql, conn, params=params)
PS you may want to read PEP-249 to see what kind of parameter placeholders are accepted
Upvotes: 2