pfcode
pfcode

Reputation: 35

Using nested query to improve SELECT performance in MySQL

I want to store multiple MPTT (Modified Preorder Travelsal Trees) in one MySQL table with the following columns: node_id, user_id, rht, lft, value. Single tree is assigned to a single user on the website.

To select tree from specified node for a user I would use:

SELECT * FROM categories 
WHERE user_id = 123 
AND lft > node_lft 
AND rht < node_rht;

I think about using nested query for this function:

SELECT t.* FROM 
 (SELECT * FROM categories WHERE user_id = 123) t
WHERE lft > node_lft
AND rht < node_rht;

Which of the queries is faster when operating on large data (e.g 10000 users, everyone have a single tree with random depth and number of elements) and why?

Upvotes: 0

Views: 884

Answers (2)

Rick James
Rick James

Reputation: 142268

Any index starting with user_id will be beneficial for either query. Please provide SHOW CREATE TABLE.

Without the subquery:

INDEX(user_id, lft) (or INDEX(user_id, rht)) will scan perhaps half of the user_id=123 rows; a small improvement over simply INDEX(user_id). Because of the 'range' (lft > nodelft), (user_id, lft, rht), a 3-column index does get past the range on lft; hence is not beneficial.

With the subquery:

  1. Extract all the rows with user_id=123; put into a tmp table.
  2. Scan that tmp table. No index is useful.

The subquery can't be faster. If it is, then it may be that there was a difference in what was cached when you performed the timing tests.

How big is the table? How big is the cache (innodb_buffer_pool_size if InnoDB)? If the table is too huge, then 'lazy lookup' may be warranted.

Do you use the node_id for anything? Is this combo unique: (user_id, lft, rht); if so it could be the PRIMARY KEY. (Accessing via the PK us usually faster than via a secondary key in InnoDB.)

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269633

Do not use nested subqueries in the FROM clause in MySQL unless you have a really good reason. MySQL materializes such subqueries. In addition to overhead, it also prevents the use of indexes for joins.

Instead, just define the right index on the table. Based on your query:

categories(user_id, lft, rht)

Upvotes: 2

Related Questions