Reputation: 97
I have a simple table, with 15 columns:
CREATE TABLE MYTABLE(
ID int(11) NOT NULL AUTO_INCREMENT,
SYMBOL varchar(100) NOT NULL,
DATE varchar(100) NOT NULL,
TIME varchar(100) NOT NULL,
NUMBER decimal(38,0) NOT NULL,
A float DEFAULT NULL,
B float DEFAULT NULL,
C float DEFAULT NULL,
D float DEFAULT NULL,
E decimal(38,0) DEFAULT NULL,
F float DEFAULT NULL,
G decimal(38,0) DEFAULT NULL,
H decimal(38,0) DEFAULT NULL,
I decimal(38,0) DEFAULT NULL,
J float DEFAULT NULL,
K float DEFAULT NULL,
L decimal(38,0) DEFAULT NULL,
M decimal(38,0) DEFAULT NULL,
MILLIS decimal(38,0) DEFAULT NULL,
PRIMARY KEY (ID)
KEY SYM (SYMBOL) USING HASH
) ENGINE=InnoDB AUTO_INCREMENT=10250241 DEFAULT CHARSET=latin1
which is indexed by symbol(hash Index). There are around 10,000,000 rows in this table(6GB) of data. When I am querying this table in workbench, for a simple query like :
select * from MYTABLE WHERE symbol = 'A' and date>= '2018-08-01' and
date<= '2018-08-09' and time>= '09:24:00' and time <= '15:24:00' order by
millis desc ;'
it is taking 4-5 seconds. The performance further decreases when the reading and writing on the database happens together. But this is a real time database, it is a requirement that the data is written from one connection and read from another.
Can someone please suggest some ways to optimize performance. I already tried a BTREE index over time, but performance further reduced.
As suggested, after running an explain on my query, I got the following result:
'Using index condition; Using where; Using filesort'
Upvotes: 1
Views: 2820
Reputation: 142433
DECIMAL(38,0)
takes 17 bytes. Do you really need that datatype? (FLOAT
takes 4, DOUBLE
takes 8, BIGINT
takes 8.) (Shrinking the 6GB will help performance some, especially if innodb_buffer_pool_size
is small.)
If millis
is for milliseconds, why is it 38 digits, not 3? Anyway, DATETIME(3)
provides a data + time + milliseconds all wrapped into about 7 bytes. Furthermore, you can do
ORDER BY datetime
and thereby allow for the efficient INDEX(symbol, datetime)
to help with the WHERE
. (This cannot be done with your current code.)
date >= '2018-08-01' and date <= '2018-08-09' and
time >= '09:24:00' and time <= '15:24:00'
filters to daytime during each of those 9 days. If that is really what you want, then no index will work well. Check the use of <=
-- I understand inclusitivity on the date, but I question it for the time.
Either INDEX(symbol, date)
or INDEX(symbol, time)
is useful. Nothing else is better (unless you can combine date+time). Add both of those.
HASH
indexes do not exist in InnoDB; that request is silently turned into BTREE
, which is about as good for 'point queries', and hugely better for 'range queries'. In your query, a 'composite' BTree query (see previous paragraph) is much better.
Please provide EXPLAIN SELECT ...
so we can further deduce what is going on.
Is everything other than ID
really optional? Consider using NOT NULL
.
'Using index condition; Using where; Using filesort'
-- "filesort" is unavoidable; live with it. "Using index condition" (aka "ICP") is good.
Upvotes: 0
Reputation: 11406
DATE varchar(100) NOT NULL,
TIME varchar(100) NOT NULL,
I would start with using DATE()
and TIME()
types instead of varchar
(or one DATETIME()
) - or store them in integer
as Unix Time
. Internally they will be more efficient than strings.
For example, comparing two integers takes around 1 CPU cycle. To compare strings, in general, every character has to be compared in a loop (until there is a difference), unless special optimizations are used. In case the data were in unicode, special lookups would have to be made for each character.
Integers also take less space (4 bytes for Unix Time) than the date/time string representation, and are not variable in length (even if the dates are all the same length, internally they will be treated as variable length strings, requiring an extra 'length field').
Also create a proper index as suggested elsewhere.
select * from MYTABLE
where symbol = 'A' and
date >= '2018-08-01' and date <= '2018-08-09' and
time >= '09:24:00' and time <= '15:24:00'
order by millis desc ;
Are you sure you want to order (only) by millis
, or is this just as a test?
For the above query, ignoring the separate sort on millis
, ideally the records would be stored on disk in this order: symbol, datetime (millis?)
. This way the records to be returned will be close together in blocks on disk. Otherwise they could be spread all over the table, requiring many disk seeks and (block)reads to retreive all the records.
Upvotes: 2
Reputation: 1270593
For this query:
select *
from MYTABLE
where symbol = 'A' and
date >= '2018-08-01' and date <= '2018-08-09' and
time >= '09:24:00' and time <= '15:24:00'
order by millis desc ;
You want an index on mytable(symbol, date, time)
. Actually time
is only there as a copy so the index covers the WHERE
clause.
Including millis
is not helpful, because of the filtering needed before the order by
.
Upvotes: 0