Performance challenge on MySQL RDS with simultaneous queries

Question

I have an app developed over NodeJS on AWS that has a MySQL RDS database (server class: db.r3.large - Engine: InnoDB) associated. We are having a performance problem, when we execute simultaneous queries (at the same time), the database is returning the results after finishing the last query and not after each query is finished.

So, as an example: if we execute a process that has 10 simultaneous queries of 3 seconds each, we start receiving the results at approximately 30 seconds and we want to start receiving when the first query is finished (3 seconds).

It seems that the database is receiving the queries and make a queue of them.

I'm kind of lost here since I changed several things (separate connections, pool connections, etc) of the code and the settings of AWS but doesn’t seem to improve the result.

TableA (13M records) schema:

CREATE TABLE `TableA` (
  `columnA` int(11) NOT NULL AUTO_INCREMENT,
  `columnB` varchar(20) DEFAULT NULL,
  `columnC` varchar(15) DEFAULT NULL,
  `columnD` varchar(20) DEFAULT NULL,
  `columnE` varchar(255) DEFAULT NULL,
  `columnF` varchar(255) DEFAULT NULL,
  `columnG` varchar(255) DEFAULT NULL,
  `columnH` varchar(10) DEFAULT NULL,
  `columnI` bigint(11) DEFAULT NULL,
  `columnJ` bigint(11) DEFAULT NULL,
  `columnK` varchar(5) DEFAULT NULL,
  `columnL` varchar(50) DEFAULT NULL,
  `columnM` varchar(20) DEFAULT NULL,
  `columnN` int(1) DEFAULT NULL,
  `columnO` int(1) DEFAULT '0',
  `columnP` datetime NOT NULL,
  `columnQ` datetime NOT NULL,
  PRIMARY KEY (`columnA`),
  KEY `columnB` (`columnB`),
  KEY `columnO` (`columnO`),
  KEY `columnK` (`columnK`),
  KEY `columnN` (`columnN`),
  FULLTEXT KEY `columnE` (`columnE`)
) ENGINE=InnoDB AUTO_INCREMENT=13867504 DEFAULT CHARSET=utf8;

TableB (15M records) schema:

CREATE TABLE `TableB` (
  `columnA` int(11) NOT NULL AUTO_INCREMENT,
  `columnB` varchar(50) DEFAULT NULL,
  `columnC` varchar(50) DEFAULT NULL,
  `columnD` int(1) DEFAULT NULL,
  `columnE` datetime NOT NULL,
  `columnF` datetime NOT NULL,
  PRIMARY KEY (`columnA`),
  KEY `columnB` (`columnB`),
  KEY `columnC` (`columnC`)
) ENGINE=InnoDB AUTO_INCREMENT=19153275 DEFAULT CHARSET=utf8;

Query:

SELECT COUNT(*) AS total
FROM TableA 
WHERE TableA.columnB IN (  
    SELECT TableB.columnC
    FROM TableB  
    WHERE TableB.columnB = "3764301"  
    AND TableB.columnC NOT IN (   
        SELECT field   
        FROM table   
        WHERE table.field = 10
    AND TableB.columnC NOT IN (    
        SELECT field   
        FROM table   
        WHERE table.field = 10 
    AND TableB.columnC NOT IN (    
        SELECT field   
        FROM table   
        WHERE table.field = 10
    AND TableB.columnC NOT IN (   
        SELECT field   
        FROM table   
        WHERE table.field = 10
)
AND columnM > 2;

1 execution return in 2s
10 executions return the first result in 20s and the another results after that time.

To see that queries are running I'm using "SHOW FULL PROCESSLIST" and the queries are most of the time with state "sending data".

It is not a performance issue about the query, it is a problem of recurrence over database. Even a very simple query like "SELECT COUNT(*) FROM TableA WHERE columnM = 5" has the same problem.

UPDATE

Only for testing purpose I reduce the query to only one subquery condition. Both results have 65k records.

-- USING IN
SELECT COUNT(*) as total 
FROM TableA 
WHERE TableA.columnB IN (  
    SELECT TableB.columnC  
       FROM TableB
       WHERE TableB.columnB = "103550181"  
       AND TableB.columnC NOT IN (   
           SELECT field   
           FROM tableX   
           WHERE fieldX = 15  
    )
) 
AND columnM > 2;

-- USING EXISTS
SELECT COUNT(*) as total 
FROM TableA 
WHERE EXISTS (  
    SELECT *  
    FROM TableB  
    WHERE TableB.columnB = "103550181"  
    AND TableA.columnB = TableB.columnC
    AND NOT EXISTS (   
        SELECT *
        FROM tableX
        WHERE fieldX = 15
        AND fieldY = TableB.columnC
   )
) 
AND columnM > 2;

-- Result
Query using IN : 1.7 sec
Query using EXISTS : 141 sec (:O)

Using IN or EXISTS the problem is the same, when I execute many times this query the data base have a delay and the response comes after a lot of time. Example: If one query response in 1.7 sec, if I execute 10 times this query, the first result is in 20 sec.

Performance challenge on MySQL RDS with simultaneous queries

Answers (1)

Related Questions