Reputation: 73619
we have a very big database and to ask the question in very simple terms, I am unable to decide whether should I add an index on date field or not.
My query is: should I add an index on date field for table A, which is very large and date (format: 2013-02-26 18:52:23). Following is my query:
SELECT As.id
FROM As INNER JOIN A_items ON A_items.A_id = As.id AND A_items.type IN ('BilledItem', 'CustomerItem')
WHERE (As.A_date BETWEEN '2013-01-15 18:52:23' AND '2013-01-30 18:52:23') AND A_items.category_id in ('20219') and A_items.product_id IN ('ACCDYHGYUDZNY7FZ')
now when I use explain on this it gives me following result:-
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: A_items
type: ref
possible_keys: index_A_items_on_A_id,index_A_items_on_product_id,i_type_parent_id_item_type
key: index_A_items_on_product_id
key_len: 258
ref: const
rows: 221122
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: As
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 8
ref: database.A_items.A_id
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
However, when I remove A_items.product_id IN ('ACCDYHGYUDZNY7FZ') from my query and run this:-
explain
SELECT As.id
FROM As INNER JOIN A_items ON A_items.A_id = As.id AND A_items.type IN ('BilledItem', 'CustomerItem')
WHERE (As.A_date BETWEEN '2013-01-15' AND '2013-01-30') AND A_items.category_id in ('2005')
I get :-
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: As
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 15427520 <--Notice this big number
Extra: Using where
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: A_items
type: ref
possible_keys: index_A_items_on_A_id,i_type_parent_id_item_type
key: index_A_items_on_A_id
key_len: 8
ref: database.As.id
rows: 1
Extra: Using where
2 rows in set (0.00 sec)
My question is why this big number of rows are not shown in the first query and how to decide whether to add index on date field or not???
Upvotes: 1
Views: 83
Reputation: 6860
It really depends on your data. The best you can do is to try with that index. If index will help depends on the query type(s) and also in this case on the date range. If the selected range is rather small, than index should help. But for example if the table holds data only for 1 month and you select data for 20-25 days, the optimizer might still choose to ignore the index. Optimizer will always try to choose that index which will produce less rows (but it doesn't know it exactly - this is estimated by index cardinality, which can be inaccurate in certain situations).
Having extra indexes doesn't hurt SELECTs, but with each extra index, all INSERTs and UPDATEs get slower as all indexes need to updated also. So if you do relatively few inserts to this table compared to selects, then you can have this extra index even if it is not going to help for this particular query.
Upvotes: 2
Reputation: 3172
I believe that it's electing to do a table-scan, in the second instance, if that number of rows seems reasonably close to the number of rows in your table. This says to me that an index on date might help, because it would choose to pull from the index rather than scanning the entire table.
Upvotes: 1
Reputation: 11579
In the first query you have more query restrictions less table scan. That's why rows number is less. I think adding index on date field should speed up a query (at least it helped me). Why just not to try it?
Upvotes: 1