haneulkim
haneulkim

Reputation: 4928

Using datetime index in where clause MySQL

I have a table with 200 million rows where index is created in "created_at" column which is datetime datatype.

show create table [tablename] outputs:

 create table `table`
 (`created_at` datetime NOT NULL)
 PRIMARY KEY (`id`)
 KEY `created_at_index` (`created_at`)
 ENGINE=InnoDB AUTO_INCREMENT=208512112 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci'

created_at ranges from 2020-04-01 ~ 2020-05-28.

I want to get only rows that are beyond 2020-05-15 23:00:00.

when I run :

EXPLAIN SELECT created_at
          FROM table
         where created_at >= '2020-05-15 23:00:00';

it says it outputs:

rows       Extra
200mil   Using Where

My understanding is that in RDMS if no index rows are not ordered but when you create an index on a column it is in sorted order therefore right after finding '2020-05-15 23:00:00' it will simply return all rows after that.

Also since its cardinalty is 7mil I'd thought using an index would be better than full table scan.

Is it because I've inputted date as a string? but when I try

 where created_at >= date('2020-05-15 23:00:00');

still the same.

and

 where created_at >= datetime('2020-05-15 23:00:00');

outputs syntax error.

Did mysql just decide it would be more efficient to do a full table scan?

EDIT:

using equals

EXPLAIN SELECT created_at
          FROM table
         where created_at = '2020-05-15';

outputs:

key_len    ref     rows     Extra
  5        const    51 

In where clause if I change string to date('2020-05-15') it outputs:

key_len    ref     rows     Extra
  5        const    51      Using index condition

does this mean that first equal query one didn't use an index?

Upvotes: 4

Views: 4644

Answers (2)

Rick James
Rick James

Reputation: 142278

If the values are evenly distributed, about 25% of the rows are >= '2020-05-15 23:00:00' Yes, Mysql will prefer a full table scan to using the index when you have such a large percentage of the table needed.

See Why does MySQL not always use index for select query?

In a DATE context, date('2020-05-15 23:00:00') is the same as '2020-05-15'.

In a DATETIME context, datetime('2020-05-15 23:00:00') is the same as '2020-05-15 23:00:00'.

Using index means that the INDEX is "covering", which means that the entire query can be performed entirely in the index's BTree -- without reaching over to the data's BTree.

Using index condition means something quite different -- it has to do with a minor optimization relating to the two layers ("handler" and "engine") in MySQL's design. (More details in "ICP" aka "Index Condition Pushdown".)

Upvotes: 1

GMB
GMB

Reputation: 222432

All of your queries would take advantage of an index on column created_at. MySQL always uses an index when it matches the predicate(s) of the where clause.

The output of your explains do indicate that you do not have this index in place, which is confirmed by the output of your create table.

Just create the index and your database will use it.

Here is a demo:

-- sample table, without the index
create table mytable(id int, created_at datetime);

--  the query does a full scan, as no index is available
explain select created_at from mytable where created_at >= '2020-05-15 23:00:00';
id | select_type | table   | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra      
-: | :---------- | :------ | :--------- | :--- | :------------ | :--- | :------ | :--- | ---: | -------: | :----------
 1 | SIMPLE      | mytable | null       | ALL  | null          | null | null    | null |    1 |   100.00 | Using where
-- now add the index
create index idx_mytable_created_at on mytable(created_at);

-- the query uses the index
explain select created_at from mytable where created_at >= '2020-05-15 23:00:00';
id | select_type | table   | partitions | type  | possible_keys          | key                    | key_len | ref  | rows | filtered | Extra                   
-: | :---------- | :------ | :--------- | :---- | :--------------------- | :--------------------- | :------ | :--- | ---: | -------: | :-----------------------
 1 | SIMPLE      | mytable | null       | index | idx_mytable_created_at | idx_mytable_created_at | 6       | null |    1 |   100.00 | Using where; Using index

Upvotes: 1

Related Questions