Reputation: 31
I have a table that currently contains about 5 million rows. This is a live database, populated continuously by a scraping script. For example: a business listing site returns a JSON response to an API call; the response is parsed and inserted into the database, with a duplicate check in between. At a later phase I use the stored data to generate reports.
While generating reports from the stored information, the script takes too long to execute. The scraping script is live and will keep adding records; I expect 0.7-1 million new rows every month.
Here is the structure of my table:
CREATE TABLE IF NOT EXISTS `biz_listing` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`lid` smallint(11) NOT NULL,
`name` varchar(300) NOT NULL,
`type` enum('cat1','cat2') NOT NULL,
`location` varchar(300) NOT NULL,
`businessID` varchar(300) NOT NULL,
`reviewcount` int(6) NOT NULL,
`city` varchar(300) NOT NULL,
`categories` varchar(300) NOT NULL,
`result_month` varchar(10) NOT NULL,
`updated_date` date NOT NULL,
PRIMARY KEY (`id`),
KEY `biz_date` (`businessID`,`updated_date`),
KEY `type_date` (`type`,`updated_date`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
The records fall under two categories, 'cat1' and 'cat2'. (I am planning to add a new category, say 'cat3'.)
I need a same-station aggregate report section, which shows the business IDs that appear in every month of a selected range of months.
Here the chosen range is June-July 2014.
Report on aggregate numbers by category
SELECT COUNT(t.`businessID`) AS bizcount, SUM(t.reviewcount) AS reviewcount, t.`type`
FROM `biz_listing` t
INNER JOIN
  ( SELECT `businessID`, COUNT(*) c
    FROM `biz_listing`
    WHERE updated_date BETWEEN '2014/06/01' AND LAST_DAY('2014/07/01')
    GROUP BY `businessID`, `type`
    HAVING c = 2 ) t2
ON t2.`businessID` = t.`businessID`
WHERE updated_date BETWEEN '2014/07/01' AND LAST_DAY('2014/07/01')
GROUP BY t.`type`
EXPLAIN (run on a backup table of 4 million rows):
Report on aggregate numbers by city
SELECT COUNT(t.`businessID`) AS bizcount, SUM(t.reviewcount) AS reviewcount, t.`type`, t.`location` AS city
FROM `biz_listing` t
INNER JOIN
  ( SELECT `businessID`, COUNT(*) c
    FROM `biz_listing`
    WHERE updated_date BETWEEN '2014/06/01' AND LAST_DAY('2014/07/01')
    GROUP BY `businessID`, `type`
    HAVING c = 2 ) t2
ON t2.`businessID` = t.`businessID`
WHERE updated_date BETWEEN '2014/07/01' AND LAST_DAY('2014/07/01')
GROUP BY t.`location`, t.`result_month`
Here we select a range of months (June-July), so the join lists all the businessIDs common to both months.
The first query reports totals by business type; the second reports them by location.
The problem is that the queries take very long to execute (600 seconds and more), and sometimes they die before completing.
Please suggest any optimizations you see for the queries.
I also suspect that indexing hurts the insertion performance of the scraping script. How should I modify the current setup, considering both insertion and retrieval performance?
Thanks in advance.
EDIT
I tried the suggested covering indexes, and it's taking much more time than usual :(
The EXPLAIN is as follows:
Upvotes: 2
Views: 2240
Reputation: 108651
This is a MyISAM table, which offers less contention between inserting queries and reporting queries than InnoDB. Therefore, let's focus first on the reporting queries. It is true that indexes slow down inserts. But queries slow down a LOT because of missing or incorrect indexes.
To troubleshoot this performance problem, I believe it helps to consider the various subqueries separately.
So let's start with one of them.
SELECT `businessID`,
count(*) c
FROM `biz_listing`
WHERE updated_date BETWEEN '2014/06/01' AND LAST_DAY('2014/07/01')
GROUP BY `businessID`,`type`
HAVING c = 2
This subquery is straightforward, and basically well-constructed. It's capable of using an index to jump to the first record meeting the updated_date range criterion, then scanning that index linearly up to the last such record. If the index also contains the type column, the subquery can collect the record counts it needs directly from the index as it scans. That's fast.
But, you don't have that index! So this subquery is doing a full table scan. As we say in New England, that's wicked slow.
If you took your compound covering index (type, updated_date) and exchanged the order of its two columns, giving (updated_date, type), it would serve as a high-performance covering index for this query. As it stands, the column order in your compound index makes it unhelpful here.
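In SQL, that swap might look like the following (the index name is illustrative, and it's worth trying on the backup table first, since rebuilding an index on millions of MyISAM rows takes a while):

```sql
-- Hypothetical statement: replaces the (type, updated_date) index
-- with one that leads on updated_date, so the range scan can use it.
ALTER TABLE `biz_listing`
  DROP INDEX `type_date`,
  ADD INDEX `date_type` (`updated_date`, `type`);
```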
Let's take a look at your first main query in the same light (omitting the subquery).
SELECT COUNT(t.`businessID`) AS bizcount,
SUM(t.reviewcount) AS reviewcount, t.`type`
FROM `biz_listing` t
WHERE updated_date BETWEEN '2014/07/01' AND LAST_DAY('2014/07/01')
GROUP BY t.`type`
(Something's not clear here. You say COUNT(t.businessID), but it's possible you want COUNT(DISTINCT t.businessID). What you have gives the same result as COUNT(*), because businessID has no NULL values. If you do use DISTINCT, you can put HAVING COUNT(DISTINCT businessID) >= 2 in the query and get rid of your need for the subquery.)
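For reference, the DISTINCT variant of the main aggregate query would look like this; whether you want it depends on whether a business can have more than one row inside a single month (this is a sketch, not a drop-in replacement for the join):

```sql
-- Counts each business once per type, even if it has several rows in July.
SELECT COUNT(DISTINCT t.`businessID`) AS bizcount,
       SUM(t.reviewcount)             AS reviewcount,
       t.`type`
FROM `biz_listing` t
WHERE updated_date BETWEEN '2014/07/01' AND LAST_DAY('2014/07/01')
GROUP BY t.`type`;
```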
This query works similarly to the previous one. It scans an index over the updated_date range, then by type, then picks up values of businessID and reviewcount. So a compound index in this order will allow this query to be satisfied by a pure index scan, which will be fast:
(updated_date, type, businessID, reviewcount)
Notice that any query that can be satisfied from the (updated_date, type) index can also be satisfied from this one, so you don't need them both.
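Adding the wider index might look like this (again, the index name is illustrative):

```sql
-- Covers the WHERE range, the GROUP BY, and both selected columns,
-- so the query never has to touch the table rows themselves.
ALTER TABLE `biz_listing`
  ADD INDEX `date_type_biz_rev`
    (`updated_date`, `type`, `businessID`, `reviewcount`);
```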
Go read about compound covering indexes, tight range scans, and loose range scans.
Your other query will probably be greatly improved by this same index. Give it a try.
You have a backup table, it seems. You can experiment with various compound indexes on that table until you get good results.
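One way to run that experiment, assuming the backup table is called biz_listing_backup (a hypothetical name), is to add a candidate index and then check the EXPLAIN output, looking for "Using index" in the Extra column:

```sql
-- Add a candidate covering index on the backup table.
ALTER TABLE `biz_listing_backup`
  ADD INDEX `date_type_biz_rev`
    (`updated_date`, `type`, `businessID`, `reviewcount`);

-- Then see whether the reporting subquery can be satisfied from it.
EXPLAIN
SELECT `businessID`, COUNT(*) c
FROM `biz_listing_backup`
WHERE updated_date BETWEEN '2014-06-01' AND LAST_DAY('2014-07-01')
GROUP BY `businessID`, `type`
HAVING c = 2;
```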
I'm reluctant to give this sort of advice:
TL;DR: change your indexes from this to that
because then you may just come back to SO with the next question and be tempted to become a support leech. Can I avoid being a "leech" when I am a beginner in a topic and only ask questions?
You know... teach a person to fish, etc.
Upvotes: 1