Guy Dafny
Guy Dafny

Reputation: 1829

How can i improce performance of sqlite with large table?

SQlite DB with single table and 60,000,000 records. time to run simple query is more then 100 seconds.
I've tried to switch to postgeSQL but its performance was even less good. Hadn't test it on mySQL or msSQL.
Shell I split the table (lets say different table for each pointID - there are some hundreds of it? or different table for each month - then I'll have maximum of 10,000,000 records?)

sql scheme:

CREATE TABLE `collectedData` (
    `id`    INTEGER,
    `timeStamp` double,
    `timeDateStr`   nvarchar,
    `pointID`   nvarchar,
    `pointIDindex`  double,
    `trendNumber`   integer,
    `status`    nvarchar,
    `value` double,
    PRIMARY KEY(`id`)
);

CREATE INDEX `idx_pointID` ON `collectedData` (
    `pointID`
);

CREATE INDEX `idx_pointIDindex` ON `collectedData` (
    `pointIDindex`
);

CREATE INDEX `idx_timeStamp` ON `collectedData` (
    `timeStamp`
);

CREATE INDEX `idx_trendNumber` ON `collectedData` (
    `trendNumber`
);

Next query took 107 seconds:

select * from collectedData 
where 
trendNumber =1 
and status <> ''  and 
timestamp <=1556793244 
and pointid in ('point1','point2','pont3','point4','point5','point6','point7','point8','point9','pointa') 
and pointIDindex % 1 = 0  
order by timestamp desc, id desc limit 5000

next query took 150 seconds (less conditions)

select * from collectedData 
where 
trendNumber =1 
and status <> ''  and 
timestamp <=1556793244 
and pointIDindex % 1 = 0  
order by timestamp desc, id desc limit 5000

Editing: Asnwer from another place - add the next index:

CREATE INDEX idx_All ON collectedData (trendNumber, pointid, pointIDindex, status, timestamp desc, id desc, timeDateStr, value)

had improved performance by factor of 3.

Editing #2: by @Raymond Nijland offer: the execution plan is: SEARCH TABLE collectedData USING COVERING INDEX idx_All (trendNumber=? AND pointID=?)" "0" "0" "0" "EXECUTE LIST SUBQUERY 1" "0" "0" "0" "USE TEMP B-TREE FOR ORDER BY"

and thanks to him - using this data, I've changed the order of the rules in the query to the next:

select * from (
select * from collectedData 
where 
trendNumber =1 
and status <> ''  and 
timestamp <=1556793244 
and pointid in ('point1','point2','pont3','point4','point5','point6','point7','point8','point9','pointa') 

and pointIDindex % 1 = 0  
order by id desc limit 5000
) order by timestamp desc

this made big improvement (for me it's solved).

Upvotes: 0

Views: 183

Answers (1)

Guy Dafny
Guy Dafny

Reputation: 1829

After @RaymondNijland had offered me to check the execution plan, I've changed the query to:

select * from (
select * from collectedData 
where 
trendNumber =1 
and status <> ''  and 
timestamp <=1556793244 
and pointid in ('point1','point2','pont3','point4','point5','point6','point7','point8','point9','pointa') 

and pointIDindex % 1 = 0  
order by id desc limit 5000
) order by timestamp desc

This query gives same results like the other, but is't 120 times faster (decrease the number of records before sorting).

Upvotes: 1

Related Questions