bzm3r
bzm3r

Reputation: 4596

Python/sqlite3: improving performance when executing thousands of "SELECT" queries

Here is simple use-case:

selectString = "SELECT MAX(shear_strain) FROM detailed_structure_info WHERE elementGroup = ? AND element = ? AND analysis = ?"
selectInputTuple = list(itertools.product(elementGroups, elementNumbers, analysisNumbers))

results = []
numExecutes = 0

for selectInputTuple in selectInputTuples:
    c = ei_cursor.execute(selectString, selectInputTuple)
    results.append(c.fetchone()[0])
    numExecutes += 1

    print "{} out of {} commands completed...".format(numExecutes, len(selectInputTuples))

Right now, it takes about 1 second to go through the loop each time, and one may need to loop through hundreds of times. I have thought about how I could execute as large a SELECT statement as possible, but for the moment, I think this is the smallest I can go, since I absolutely need to aggregate one particular column before moving on. So, how else could I write my SELECT statement to optimize for speed?

Upvotes: 1

Views: 114

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269753

Looping through tuples to for an aggregation seems wasteful. Just use group by and get all the results at once:

SELECT elementGroup, element, analysis, MAX(shear_strain) as max_shear_strain
FROM detailed_structure_info 
GROUP BY elementGroup, element, analysis;

If you have to run a separate query, for some reason, then create an index:

create index idx_detailed_structure_info_4
    on detailed_structure_info(elementGroup, element, analysis, shear_strain);

Upvotes: 2

Related Questions