Reputation: 1605
I gather a list of items for each item I check the database with SQL query with the following code:
SELECT *
FROM task_activity as ja
join task as j on ja.task_id = j.id
WHERE j.name = '%s'
AND ja.avg_runtime <> 0
AND ja.avg_runtime is not NULL
AND ja.id = (SELECT MAX(id) FROM task_activity
WHERE task_id = ja.task_id
and avg_runtime <> 0
AND ja.avg_runtime is not NULL)
% str(task.get('name'))).fetchall()
But do I need to iterate through the list and make a query for everyone. This list is quite large at times. Can I just make one query and get back a list data set? In this particular query I'm only looking for the column avg_runtime with the task id and the maximum id will be the last calculated runtime.
I don't have access to the database other then to make queries. Using Microsoft SQL Server 2012 (SP1) - 11.0.3349.0 (X64)
Upvotes: 0
Views: 181
Reputation: 10976
You might be able to speed this up using row_number()
. Note, I think there's a bug in your original query. Should ja.avg_runtime
in the subquery just be avg_runtime
?
sql = """with x as (
select
task_id,
avg_runtime,
id,
row_number() over (partition by ja.task_id order by ja.id desc) rn
from
task_activity as ja
join
task as j
on ja.task_id = j.id
where
j.name in ({0}) and
ja.avg_runtime <> 0 and
ja.avg_runtime is not null
) select
task_id,
avg_runtime,
id
from
x
where
rn = 1;"""
# build up ?,?,? for parameter substitution
# assume tasknames is the list containing the task names.
params = ",".join(tasknames.map(lambda x: "?"))
# connection is your db connection
cursor = connection.cursor()
# interpolate the ?,?,? and bind parameters
cursor.execute(sql.format(params), tasknames)
cursor.fetchall()
the following index should make this query pretty fast (although it depends how many rows are being excluded by the filters on ja.avg_runtime
):
create index ix_task_id_id on task_activity (task_id, id desc);
Upvotes: 1