Gaurav Vichare
Gaurav Vichare

Reputation: 1183

How can I optimize DAL web2py query?

result = db((db.company.location.belongs(locations)) &
            (db.company.emp_id.belongs(employee_ids)) &
            (db.company.type.belongs(types))).select()

locations is list of location ids

employee_ids is list of employee ids

types = ['General', 'office', 'e-commerce']

This query return 60,000 records and takes 1 minute to complete. How can I optimize it or split it?

Upvotes: 0

Views: 1016

Answers (2)

Gaurav Vichare
Gaurav Vichare

Reputation: 1183

I found myself a solution.

Company table has 20 columns. It is not specified in query that which fields to select, query returns 60,000 records each having 20 fields.

I optimized query by selecting only those column which are needed.

I needed only id and name. So I changed the query to following, now query takes only 10 seconds (previous was 60 seconds):

result = db((db.company.location.belongs(locations)) &
            (db.company.emp_id.belongs(employee_ids)) &
            (db.company.type.belongs(types))).select(db.company.id, db.company.name)

Upvotes: 0

Diogo Martins
Diogo Martins

Reputation: 937

The reason why your DAL query takes ~1 minute is because the return of the .select() method is a Rows instance, which contains 60,000 Row objects. Many instantiations of those classes consists in a time and memory consuming operation. As a solution, you could:

  • Split that query in pieces of X, using the limit by parameter: .select(limitby=(0, X)) or;
  • Use executesql db().executesql(SELECT * FROM...) to construct your query using SQL. The return would be tuples, theres isn't any Row or Rows instantiations and its much faster, but you would have the downside of not being able to enjoy the benefits of a Row object;

If any of the above solved your problem, and you have a really time consuming operation, you could try to use the split solution along with threads.

Upvotes: 2

Related Questions