slicks1
slicks1

Reputation: 349

Query still going very slowly

I have a table with 2 million records.

Here is the table

comments
---------
    +-------------+---------------+------+-----+---------+----------------+
    | Field       | Type          | Null | Key | Default | Extra          |
    +-------------+---------------+------+-----+---------+----------------+
    | commentid   | int(11)       | NO   | PRI | NULL    | auto_increment |
    | parentid    | int(11)       | YES  |     | 0       |                |
    | refno       | int(11)       | YES  |     | 0       |                |
    | createdate  | int(11)       | YES  | MUL | 0       |                |
    | remoteip    | varchar(80)   | YES  |     |         |                |
    | fingerprint | varchar(50)   | YES  |     |         |                |
    | locid       | int(11)       | YES  | MUL | 0       |                |
    | clubid      | int(11)       | YES  |     | 0       |                |
    | profileid   | int(11)       | YES  | MUL | 0       |                |
    | userid      | int(11)       | YES  | MUL | 0       |                |
    | global      | int(11)       | YES  |     | 0       |                |
    | official    | int(11)       | YES  |     | 0       |                |
    | legacyuser  | int(11)       | YES  | MUL | 0       |                |
    | mediaid     | int(11)       | YES  |     | 0       |                |
    | status      | int(11)       | YES  |     | 1       |                |
    | comment     | varchar(4000) | YES  |     |         |                |
    | likes       | int(11)       | YES  |     | 0       |                |
    | dislikes    | int(11)       | YES  |     | 0       |                |
    | import      | int(11)       | YES  |     | 0       |                |
    | author      | varchar(50)   | YES  |     |         |                |
    +-------------+---------------+------+-----+---------+----------------+

Now this query against 2 million records takes 6 to 7 seconds:

select * from comments where (locid=2085 or global=1) and status>0 order by createdate desc limit 20;

I decided to add an index to locid and it still produces results in 6 to 7 seconds

I could have used a sqlfiddle but it would have been unneccessary since the basis of this question pertains to performance and I'm not adding 2mil records to sqlfiddle.

Are there any strategies or implementations which can get this query into the 3 seconds range?

Thanks!

UPDATE

This is my explain show table.

        | comments | CREATE TABLE `comments` (
      `commentid` int(11) NOT NULL AUTO_INCREMENT,
      `parentid` int(11) DEFAULT '0',
      `refno` int(11) DEFAULT '0',
      `createdate` int(11) DEFAULT '0',
      `remoteip` varchar(80) DEFAULT '',
      `fingerprint` varchar(50) DEFAULT '',
      `locid` int(11) DEFAULT '0',
      `clubid` int(11) DEFAULT '0',
      `profileid` int(11) DEFAULT '0',
      `userid` int(11) DEFAULT '0',
      `global` int(11) DEFAULT '0',
      `official` int(11) DEFAULT '0',
      `legacyuser` int(11) DEFAULT '0',
      `mediaid` int(11) DEFAULT '0',
      `status` int(11) DEFAULT '1',
      `comment` varchar(4000) DEFAULT '',
      `likes` int(11) DEFAULT '0',
      `dislikes` int(11) DEFAULT '0',
      `import` int(11) DEFAULT '0',
      `author` varchar(50) DEFAULT '',
      PRIMARY KEY (`commentid`),
      KEY `comments_locid` (`locid`),
      KEY `comments_userid` (`userid`),
      KEY `idx_legacyusers` (`legacyuser`),
      KEY `profile_index` (`profileid`),
      KEY `comments_createdate` (`createdate`),
      KEY `compound_for_comments` (`locid`,`global`,`status`),
      KEY `global` (`global`),
      KEY `status` (`status`)
    ) ENGINE=InnoDB AUTO_INCREMENT=3848451 DEFAULT CHARSET=latin1

Upvotes: 1

Views: 57

Answers (4)

Rick James
Rick James

Reputation: 142208

This is likely to be even faster than the queries mentioned in two answer so far:

SELECT  c.*
    FROM  ( 
              ( SELECT  commentid, createdate
                    FROM  comments
                    WHERE  locid=2085
                      AND  status > 0
                    ORDER BY  createdate DESC
                    LIMIT  20 
              )
            UNION  DISTINCT 
              ( SELECT  commentid, createdate
                    FROM  comments
                    WHERE  global=1
                      AND  status > 0
                    ORDER BY  createdate DESC
                    LIMIT  20 
              )
            ORDER BY  createdate DESC
            LIMIT  20 
          ) x
    JOIN  comments c USING (commentid);

With both of these "covering" indexes:

INDEX(locid,  status, createdate, commentid)
INDEX(global, status, createdate, commentid)

(based on later info) Since (global=1) is usually true and (status>0) is usually false, the following may be better. (There is a question of whether DESC adds a monkey wrench.)

INDEX(locid,  createdate, status, commentid)
INDEX(global, createdate, status, commentid)

There is still a risk with global. If it is 'usually' 1, then the above indexes may not be optimal.

This formulation will be faster because the subqueries will be entirely in the index ("covering"), rather than lugging around all the columns (*). That does require an extra SELECT, but it is an efficient JOIN on the PRIMARY KEY of only 20 rows. If your table ever became too big to cache, this would be a big performance bonus.

I was explicit with UNION DISTINCT on the assumption that you would get dups otherwise. If not, then UNION ALL would be faster.

Schema critique:

  • Use appropriately sized INTs -- INT is 4 bytes; TINYINT is only 1; etc.
  • Use UNSIGNED where appropriate (especially, for ids and counts).
  • Use NOT NULL where appropriate.
  • Don't index flags (global? status?) by themselves; the index is very unlikely to be used.
  • How many distinct values does status hold? If status>0 can be replaced by status=1, my suggested index will work even better.

Making the data smaller may speed up this query (and others).

Upvotes: 0

Paul Spiegel
Paul Spiegel

Reputation: 31772

Try this one:

select distinct * from (

    select * from (
        select * from comments where locid=2085 and status>0 order by commentid desc limit 20
    ) t1

     union all

      select * from (
         select * from comments where global=1 and status>0 order by commentid desc limit 20
     ) t2

 ) t
order by commentid desc 
limit 20

with indexes on (locid, status) and (global, status). (status, global) might be better than (global, status) - it depends on which column is more selective.

That only works if createdate is sorted equal to commentid. Othewise you would need indexes like (locid, status, createdate) and order by createdate.

Upvotes: 0

Bohemian
Bohemian

Reputation: 424973

Most DBs, especially MySQL, are notoriously bad with or.

You can eradicate the or by splitting the query into a union with each half handling one side of the or, like this:

select * from (
    select * from comments
    where locid = 2085
    and status > 0
    union
    select * from comments
    where global = 1
    and status > 0) x
order by createdate desc
limit 20

Upvotes: 2

BlackBrain
BlackBrain

Reputation: 1019

I believe 'order by' is causing this to take so much time. Remove order by and see if it is changed . You can order by the Primary Key because the later records have bigger assigned Primary Key which is a key and faster. Other option is to use an engine that is stored on memory rather than hard disk.

Upvotes: 0

Related Questions