Reputation: 415
I'm having some truobles in with this query, everytime when i use it the cpu usage goes from a 5% to 67%-100%.
I'm running the mysql server in ubuntu by a java service but even if i execute the query via any mysql ide the results are the same.
I have made some search in the web about it so i'm posting the mysql's config file. I add some atributes then i had found in some post but i think i just made it worse.
Well, this is my my.cnf file:
[mysqld]
innodb_file_per_table=1
innodb_buffer_pool_size = 256M
wait_timeout = 1800
local-infile=0
open_files_limit=10192
query_cache_size=128M
join_buffer_size=128K
thread_cache_size=4
table_cache=64
key_buffer_size=128M
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 1336
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address = 0.0.0.0
key_buffer = 2014M
max_allowed_packet = 2014M
thread_stack = 512K
thread_cache_size = 1024
myisam-recover = BACKUP
max_connections = 200
query_cache_limit = 2048M
log_error = /var/log/mysql/error.log
expire_logs_days = 10
max_binlog_size = 100M
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
[isamchk]
key_buffer = 16M
!includedir /etc/mysql/conf.d/
I'm using this query:
select regPosition.deviceId, count(regPosition.speed), max(regPosition.speed) from regPosition where (TIMESTAMPDIFF(MINUTE, lastPositionTime,now()) <= '5') and regPosition.speed >= '10' group by regPosition.deviceId;
The table's type is Myisam and it has about 2M registry and has idPosition as a index. This is the create table query:
CREATE TABLE `regPosition` (
`idPosition` int(11) NOT NULL AUTO_INCREMENT COMMENT 'Id autoincremental.',
`deviceId` int(5) NOT NULL COMMENT 'Id numérico del equipo. Identificador único para cada vehículo.',
`lastPositionTime` datetime NOT NULL COMMENT 'Fecha/hora en que se registra la marca de posición (realizada por el dvr).',
`divisionew` varchar(2) DEFAULT NULL COMMENT 'Orientación Este u Oeste.',
`longitude` int(11) NOT NULL COMMENT 'longitud.',
`divisionns` varchar(2) DEFAULT NULL COMMENT 'Orientación Norte o Sur.',
`latitude` int(11) NOT NULL COMMENT 'Latitud.',
`direction` int(11) DEFAULT NULL COMMENT 'Dirección en que apunta el dispositivo.',
`gradeLon` varchar(100) DEFAULT NULL COMMENT 'Longitud transformada a grados (en decimal).',
`gradeLat` varchar(100) DEFAULT NULL COMMENT 'Latitud transformada a grados (en decimal).',
`speed` int(11) NOT NULL COMMENT 'Velocidad del vehículo. Registrada por el dvr',
PRIMARY KEY (`idPosition`),
KEY `index` (`idPosition`) USING HASH
) ENGINE=MyISAM AUTO_INCREMENT=6562682 DEFAULT CHARSET=latin1;
[EDIT]
The purpose of the query it's to get the device's id and the number of times than the speed it's bigger than 10 (that's just an example, it could be more) and get the max speed recorded by the database.
The idea of this it's the following: If the speed goes bigger than 60kmh for 5 times in a 5 minutes lapse, i need to know the device's id, max speed and the number of times that exceeded the speed limit.
If you can give me any help i would be very happy :).
Thanks for the help.
Upvotes: 0
Views: 2203
Reputation: 415
I actually found the problem with the high cpu usage.
The problem was the bad used index of the table, basically was the order of the definition when I created the indexes.
When I used the "explain" in the query I was getting that the search was done in millions of registries, even with a direct filter (even it happend with limit 100).
The reason was the indexes, so I reasiged the order of the indexes, in that way the explain shows the search was done in less than 400 registres and the cpu usage was perfect.
For example this sql:
EXPLAIN
SELECT d.deviceId, r.divisionew
FROM
device d, regPosition r
WHERE
d.enabled = 1
AND d.deviceId = r.deviceId
AND (DATE_SUB(now(),INTERVAL 8 MINUTE)) < r.lastPositionTime
ORDER BY d.deviceId DESC
This one was doing the search without using the indexes, so the query was done searching about 8 millions of registries.
With the reasignation of indexes and the order was correctly, this query was done searching in less than 400 rows.
So basically changing the order of the indexes I could get the index work and with this the high cpu usage dissapeard.
I just change the order of the indexes. I had before deviceId, lastPositionTime and probably more. With that order the index didn't work so after the change, adding the divisionew, altering the order and deleting the useless parameters the index works like a charm.
So when you had some troubles with high cpu usage with mysql you should check the indexes order because if you have indexes designed that should do the work to speed up the query and if it don't there probably be a problem with the indexes assignation.
At least that solved my problem.
Thank you all for your help.
Upvotes: 1
Reputation: 48139
Here is my approach, and it should blow through the table ONCE with no joins using MySQL variables. The premise I'm applying here is as follows. The ordered results must be that of the device ID and the time reported. The MySQL variables will track... Am I working on the same device as the last record?, AND, is the current record within 5 minutes of the last record that either started the device, or last time over the given speed limit being broken and here's why. Lets say I have the a given device reporting over a 15 minute period, starting at 8:00 am, and reported every minute for simplistic purposes. The speeds are as follows:
8:00 58 -- Start group 8:00, set max time to still consider as 8:05
8:01 60 -- speeding... within the 8:00-8:05 range. NEW end 5 minutes from now 8:06
8:02 58 -- not speeding
8:03 58 -- not speeding
8:04 59 -- not speeding
8:05 58 -- not speeding
8:06 59 -- not speeding... end of the 8:06 range, 1 over limit, ignore this
8:07 60 -- NEW cycle for device, start at 8:07 set ending time to 8:12 -- SPEEDING 1
8:08 61 -- STILL speeding max 61, extend ending time from 8:08 + 5 minutes to 8:13
8:09 62 -- STILL speeding max 62, extend ending time from 8:09 + 5 minutes to 8:14
8:10 59 -- not speeding
8:11 59 -- not speeding
8:12 60 -- SPEEDING AGAIN, within the 8:14 cutoff... reset 5 minutes from now 8:17
8:13 62 -- speeding still, extend to 8:18
8:14 64 -- speeding still, new max 64 and extend to 8:19
8:15 62 -- still speeding... total times 7 with max of 64 (so far)
if it is not considered in the above format, you would have to count all records from
8:00 to 8:05
8:01 to 8:06
8:02 to 8:07, etc.
For the sample data, you would then get speeding from
8:05 to 8:09
8:06 to 8:10
8:07 to 8:11, etc
but this way keeps looking at a rolling 5 minute cutoff from the last SPEEDING time. if for example, between 8:20-8:26 the person cuts back to LESS than 60, then a new cycle would begin at the next time speeding was encountered. Otherwise, consider how many records would be returned if a person was speeding from 8:00 to 10:00. You would have 2*60 records IF the reporting was 1 per minute... if reporting is more frequent, it would show up even more.
As the device ID changes, the "first Time" identifies the beginning of the next cycle of speeding consideration. After each record is processed, that Device ID is put into the "@lastDevice" variable for the next cycle compare. The @nextTimeCutoff will either be the same as the previous row if no speeding encountered on the current record, or ADD 5 MINUTES regardless of device... if the device is changing as it's the last for an entry, the next row will be starting a new "LimitSequence" incremented by 1 anyhow.
Now, all that said, this query will simplify items for you. If you KNOW the speed limit you are concerned about (60kph), my inner query is prequerying ONLY those records that a device is marked as speeding. If you have 2M records to go through, and MOST of them are NOT speeding, then why even BOTHER with them. So, the inner query is qualifying ONLY those that are speeding and puts them in the proper order for mysql variable processing. It will handle the rolling time period from the first time the person was speeding as described above.
select
r1.DeviceID,
@overLimitSeq := if( r1.DeviceID = @lastDevice
AND r1.lastPositionTime <= @nextTimeCutoff,
@overLimitSeq, @overLimitSeq +1 ) as LimitSequence,
min( r1.lastPositionTime ) as SpeedingTimeStart,
max( r1.lastPositionTime ) as SpeedingTimeEnd,
max( r1.speed ) as MaxSpeed,
count(*) as TimesOverLimit,
@lastDevice := r1.DeviceID as CompareForNextID,
@nextTimeCutoff := r1.lastPositionTime + interval 5 minute as NextCutoff
from
( select rp.deviceID,
rp.lastPositionTime,
rp.speed
from
regPosition rp
where
rp.speed >= 60
order by
rp.deviceID,
rp.lastPositionTime ) r1,
( select @lastDevice := 0,
@overLimitSeq := 0,
@nextTimeCutoff := now() ) sqlvars
group by
r1.DeviceID,
LimitSequence
having
TimesOverLimit > 2
order by
r1.deviceID,
r1.lastPositionTime
Upvotes: 1
Reputation: 108380
Add an index:
... ON (deviceId, lastPositionTime, speed)
and change the query. Replace this predicate:
TIMESTAMPDIFF(MINUTE, lastPositionTime,now()) <= '5'
with this:
lastPositionTime >= NOW() + INTERVAL -5 MINUTE
This will give you a covering index for the query (the EXPLAIN will show "Using index") which means the query can be satisfied from the index.
You want deviceId
column first, because MySQL can optimize the GROUP BY operation using an index with that column as the leading column (avoids a sort operation).
You want a predicate on the bare lastPositionTime
column, so MySQL can do an index range scan. When that column is buried inside a function (e.g. TIMESTAMPDIFF), MySQL has to evaluate that function for every single row. The index range scan is faster because it can eliminate the function evaluation on a whole lot of rows.
Since there's also a predicate on the speed
column, you want that in the index as well.
There are no other columns referenced, so the query can be satisfied entirely from the index, without having to visit pages in the underlying table.
Upvotes: 1