Reputation: 753
I have this SQL query
SELECT *
FROM `storages_remains`
WHERE `FREE` > 0
AND `CATALOG_ID` IN (...)
CATALOG_ID
is not unique, because table may contains multistorages data.
CATALOG_ID + STORAGE_ID
are unique, BUT I can't specify certain storage in that query, it should work for all.
FREE
is an integer value of storages remains.
What can I do to optimize this SQL query?
I have an idea to cache something like map
product_id => storage_id => is_avail:bool
is are there any SQL solutions?
From the asker's comment, note 5000 values in the IN:
Table contains about i 1 billion rows in average. Its updates everyday. I expect about 2thousand in average output. Currently when i query with 5 thousands of product its take about 300ms. In in clause will be about 1-5 thousands ids
Currently table have following indexes
All indexes was added by another programmers and i cant clearly understand how does free index works becouse FREE its just a number.
Is any way to create logical virtual column like avail = free>0 : true :false?
Currently query for example
EXPLAIN SELECT * FROM `s4y_catalog_storages_remains` WHERE `FREE`>0 AND CATALOG_ID IN('0d7bd57b-878a-11ed-96e5-001b21e8ace8','fa2f272f-ca6d-11ea-809e-a0369f4567aa','7661e585-5c7d-11e5-a9c4-a0369f4567aa')
Explain plan from sequlize row query:
[
[
{
id: 1,
select_type: 'SIMPLE',
table: 's4y_catalog_storages_remains',
partitions: null,
type: 'range',
possible_keys: 'CATALOG_ID,CATALOG_ID2,FREE,CATALOG+STORAGE+FREE',
key: 'CATALOG_ID',
key_len: '110',
ref: null,
rows: 13050,
filtered: 16.95,
Extra: 'Using index condition; Using where'
}
],
[
{
id: 1,
select_type: 'SIMPLE',
table: 's4y_catalog_storages_remains',
partitions: null,
type: 'range',
possible_keys: 'CATALOG_ID,CATALOG_ID2,FREE,CATALOG+STORAGE+FREE',
key: 'CATALOG_ID',
key_len: '110',
ref: null,
rows: 13050,
filtered: 16.95,
Extra: 'Using index condition; Using where'
}
]
]
Upvotes: 0
Views: 258
Reputation: 35171
Run showplan, then put an index on catalog_id. Run showplan and see if it's better. If not, try...
Put an index on catalog_id and free. Run showplan. If not, try...
If many rows are free = 0, put an index on catalog_id and free > 0. Run showplan. (This is basically the cache you wanted.) If not, try...
IN often results in multiple index lookups or falling back to a table scan. You have an IN with 5000 ids? Yeah, you're either doing 5000 queries or a table scan. And that's huge, IN isn't meant to be used that way, and it may actually be slowing down at the query parser. Sometimes a join is faster. Create a table of the catalog ids in your IN, then join to that table in your query. Note that you'll need the index from solution 1. Run showplan.
Upvotes: 0