Tech Savant
Tech Savant

Reputation: 3766

MySQL query using subselects needs to use joins or exists

I've got this query that has been added to over time and even merged with other queries etc, so it has become quite a mess.

It takes WAY too long to execute now. I tried using EXPLAIN EXTENDED and add any indexes / keys I could but nothing I did helped for some reason.

I'm pretty sure the reason is all the sub-selects as mysql has to create a temporary table in memory and perform a non-indexed lookup on that table for every single row (at least, that's what I've been reading).

I've been reading up on sub-queries, joins, and using exists to try to optimize this thing, but I am just not understanding the best way to go about this.

Is there any way to use joins or use exists to replace some of the sub-queries and get this query to run faster?

The query:

SELECT 
      s.*, 
      case when m_id.make_name is not null 
                then m_id.make_name
           when m.make_name is not null 
                then m.make_name
                else s.brand end as brandname
   FROM 
      services as s 
         left join makelist as m_id 
            on cast(s.brand as unsigned) = m_id.id 
         left join makelist as m 
            on s.brand = m.make_name 
   WHERE 
          s.is_delete = 'n' 
      and UNIX_TIMESTAMP(s.`date`) >= 1420070400 
      and UNIX_TIMESTAMP(s.`date`) <= 1451563199 
      and s.service_id in ( select ticket_id
                               from messages
                               where edit_id = 0 
                                 and waiting = 1 
                                 and message_id not in ( select edit_id
                                                            from messages
                                                            where edit_id != 0 ) 
                          ) 
      or service_id in ( select ( select m3.ticket_id
                                     from messages m3 
                                     where m88.edit_id = m2.message_id ) as ticket_id 
                            from 
                               messages m88 
                            where m88.edit_id in ( select t11.edit_id
                                                      from 
                                                         ( select max(`datetime`) as newdate 
                                                             from messages
                                                             where edit_id != 0 
                                                             group by edit_id ) as t22, 
                                                         messages as t11 
                                                      where t11.`datetime` = t22.newdate 
                                                        and `waiting` = 1 ) 
                       ) 
     and s.service_id in ( select ticket_id
                              from messages
                              where edit_id = 0 
                                and warning = 1 
                                and message_id not in ( select edit_id
                                                           from messages
                                                           where edit_id != 0 )
                         ) 
      or service_id in ( select 
                               ( select m33.ticket_id
                                    from messages m33 
                                    where m888.edit_id = m22.message_id ) as ticket_id
                            from messages m888 
                            where m888.edit_id in ( select t111.edit_id 
                                                       from ( select max(`datetime`) as newdate 
                                                                 from messages
                                                                 where edit_id != 0 
                                                                 group by edit_id ) as t222, 
                                                            messages as t111 
                                                       where t111.`datetime` = t222.newdate
                                                        and `warning = 1 ) 
                       ) 
   order by 
      s.`date` desc 
   limit 
      0, 10

And ... the data sample...

table: messages

CREATE TABLE IF NOT EXISTS `messages` (
  `message_id` int(10) NOT NULL AUTO_INCREMENT,
  `employee_id` int(10) NOT NULL,
  `admin_id` int(10) NOT NULL,
  `ticket_id` int(10) NOT NULL,
  `message` text NOT NULL,
  `status` char(1) NOT NULL COMMENT 'r=read, u=unread',
  `datetime` datetime NOT NULL,
  `warning` tinyint(1) NOT NULL DEFAULT '0',
  `waiting` tinyint(1) NOT NULL DEFAULT '0',
  `edit_id` int(10) NOT NULL,
  PRIMARY KEY (`message_id`),
  KEY `message_id` (`message_id`),
  KEY `edit_id` (`edit_id`),
  KEY `ticket_id` (`ticket_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=197 ;



INSERT INTO `messages` (`message_id`, `employee_id`, `admin_id`, `ticket_id`, `message`, `status`, `datetime`, `warning`, `waiting`, `edit_id`) VALUES
(189, 18, 0, 4049, 'Ordered battery ', 'u', '2015-06-02 13:14:38', 0, 1, 0),
(190, 18, 0, 4069, 'Ordered Ram', 'u', '2015-06-04 09:17:57', 0, 0, 0),
(191, 18, 0, 4069, 'Ordered Ram', 'u', '2015-06-04 09:18:43', 0, 1, 0),
(192, 18, 0, 4068, 'Ordered Hard Drive', 'u', '2015-06-04 13:40:13', 0, 1, 0),
(193, 1, 0, 3712, 'customer called just now and said data was missing from last time it was here, i informed her that we keep backups for a month (not 4) and that was definitely gone, and that her screen was still going blank, and i informed her she needed to drop it by for free test. she said her daughter has it in another county and it will be a while before she can bring it in. ', 'u', '2015-06-06 09:59:27', 1, 0, 0),
(194, 18, 0, 4089, 'Ordered Keyboard ', 'u', '2015-06-09 09:51:33', 0, 1, 0),
(195, 18, 0, 4103, 'Battery  PA3817u-1BRS.... $39 or Jack $100..  customer said will bring it back next week. ', 'u', '2015-06-11 16:53:16', 0, 0, 0),
(196, 18, 0, 4105, 'Ordered Screen ', 'u', '2015-06-12 11:26:09', 0, 1, 0);

table: makelist

CREATE TABLE IF NOT EXISTS `makelist` (
  `id` int(255) NOT NULL AUTO_INCREMENT,
  `make_name` varchar(255) NOT NULL,
  `make_desc` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=31 ;

INSERT INTO `makelist` (`id`, `make_name`, `make_desc`) VALUES
(1, 'Acer', ''),
(2, 'Apple', ''),
(3, 'ASUS', ''),
(4, 'Compaq', ''),
(5, 'Dell', ''),
(6, 'Gateway', ''),
(7, 'HP', ''),
(8, 'IBM', ''),
(9, 'Lenovo', ''),
(10, 'Sony', ''),
(11, 'Toshiba', ''),
(27, 'Microsoft', ''),
(26, 'Printer Only', ''),
(25, 'Custom', ''),
(23, 'eMachine', ''),
(24, 'MSI', ''),
(30, 'Panasonic', ''),
(28, 'Samsung', '');

table: services

CREATE TABLE IF NOT EXISTS `services` (
  `service_id` int(10) NOT NULL AUTO_INCREMENT,
  `employee_id` int(10) NOT NULL,
  `customer_id` int(10) NOT NULL,
  `name` varchar(255) NOT NULL,
  `date` datetime NOT NULL,
  `phone` text NOT NULL,
  `alternate_phone` text NOT NULL,
  `email` varchar(50) NOT NULL,
  `brand` varchar(50) NOT NULL,
  `model` varchar(50) NOT NULL,
  `serial_tag` varchar(50) NOT NULL,
  `password` varchar(25) NOT NULL,
  `type` char(1) NOT NULL,
  `emergency` char(1) NOT NULL,
  `symptoms` varchar(100) NOT NULL,
  `left_items` text NOT NULL,
  `employee_note` text NOT NULL,
  `is_delete` char(1) NOT NULL DEFAULT 'n' COMMENT 'y=yes, n=no',
  `pickedup` tinyint(1) NOT NULL,
  `pickup_time` datetime NOT NULL,
  `how_paid` varchar(255) NOT NULL DEFAULT 'NA',
  `on_call_list` tinyint(1) NOT NULL,
  `call_list_note` mediumtext NOT NULL,
  `exclude` tinyint(1) NOT NULL DEFAULT '0',
  `paymentAmount` decimal(7,2) NOT NULL,
  `typeother` varchar(255) NOT NULL,
  `na_reason` varchar(255) NOT NULL,
  PRIMARY KEY (`service_id`),
  KEY `service_id` (`service_id`),
  KEY `employee_id` (`employee_id`),
  KEY `customer_id` (`customer_id`),
  KEY `is_delete` (`is_delete`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=4121 ;


INSERT INTO `services` (`service_id`, `employee_id`, `customer_id`, `name`, `date`, `phone`, `alternate_phone`, `email`, `brand`, `model`, `serial_tag`, `password`, `type`, `emergency`, `symptoms`, `left_items`, `employee_note`, `is_delete`, `pickedup`, `pickup_time`, `how_paid`, `on_call_list`, `call_list_note`, `exclude`, `paymentAmount`, `typeother`, `na_reason`) VALUES
(4118, 18, 0, 'custnameone', '2015-06-12 13:36:00', '(111) 111-1442', '', '', 'Other:::Packard Bell', 'MS2290', '', 'pass', 'l', '', '::diagnostics::', 'power_cord::', 'Will not turn on.. ', 'n', 0, '0000-00-00 00:00:00', 'NA', 0, '', 0, '0.00', '', ''),
(4119, 18, 0, 'custnametwo', '2015-06-12 15:51:00', '(111) 111-9390', '(111) 111-8207 cell', 'email@yahoo.com', '11', 'Satellite  L675', '', '', 'l', 'n', ':virus:::', '::', 'Clean up.. Virus\r\n', 'n', 0, '0000-00-00 00:00:00', 'NA', 0, '', 0, '0.00', '', ''),
(4120, 18, 0, 'custnamethree', '2015-06-12 17:57:00', '(111) 111-1455', '', 'email@yahoo.com', '10', 'Vaio E - Sve151D11L', '', '1234', 'l', 'n', ':virus:diagnostics::', 'power_cord::', 'Will not boot to windows ', 'n', 0, '0000-00-00 00:00:00', 'NA', 0, '', 0, '0.00', '', '');

Update, after request for more detail:

This query is listing all the records in the services table and is generated dynamically through PHP. Each record in services table can have 1 or more messages attached to it, linked through services.service_id = messages.ticket_id. When someone posts a message (or edits one) they can select to flag it as "warning" and/or "waiting". This query is pulling up the tickets who have messages with warning or waiting set to 1.

However, another layer of the onion peeled back and we come to the message edits. Message edits are stored in the same table, the difference is that a message edit has no ticket_id but instead has and edit_id which equals the message_id of the original message. So the query has to find the ticket, find the messages associated with the ticket, find out if those messages have edits, and determine which one is the most recent current version of the message and if that current version of the message is marked with a warning or waiting. Hence the messy cumbersome query.

The stuff dealing with the makelist table and brand is just there for completeness as it was tricky to get working and want to make sure whatever solution has that included. It's not really pertinent in this case, but the makelist/brand stuff is basically looking up the name of the brand based on the brand id that is stored in the services table in the brand column.

Upvotes: 0

Views: 89

Answers (1)

DRapp
DRapp

Reputation: 48179

Whatever comments you supply to this answer, I will continue with helping try revision for you, but was too much to describe in a comment to your original post.

You are looking at services within a given UNIX time range, yet your qualifier on service_id in sub-selects is looking against ALL messages. So, are you only interested in tickets that first qualify for the time range in question?

your complex WHERE clause (abbreviated...)

WHERE 
          s.is_delete = 'n' 
      and UNIX_TIMESTAMP(s.`date`) >= 1420070400 
      and UNIX_TIMESTAMP(s.`date`) <= 1451563199 
      and s.service_id in ... (sub-qualify 1)
       or service_id in ... (sub-qualify 2)
      and s.service_id in ... (sub-qualify 3)
       or service_id in ... (sub-qualify 4)

is actually running against ALL messages (per the sub-qualify instances 1-4).

It MIGHT help to do a prequery of original tickets (message_id's) ONLY within the date range first and their qualified child message edits to find the max date vs the entire.

Here is something I came up with and will try to describe and you digest.

SELECT
      s2.*,
      COALESCE( m_id.make_name, COALESCE( m.make_name, s2.brand )) as brandname
   from
      ( SELECT
              m.ticket_id,
              SUM( case when edits.edit_id = 0 then 0 else 1 end ) as NumberOfEdits,
              SUM( m.waiting + coalesce( edits.waiting, 0 ) ) as WaitingMsgs,
              SUM( m.warning + coalesce( edits.warning, 0 )) as WarningMsgs,
              SUM( m.waiting 
                 + m.warning 
                 + coalesce( edits.waiting, 0 )
                 + coalesce( edits.warning, 0 ) ) as WaitOrWarnCount,
              MAX( case when edits.waiting = 1 then edits.`datetime` else null end ) as LatestWaitingDate,
              MAX( case when edits.warning = 1 then edits.`datetime` else null end ) as LatestWarningDate,
              MAX( case when edits.waiting = 1 then edits.message_id else null end ) as LatestWaitingMsgID,
              MAX( case when edits.warning = 1 then edits.message_id else null end ) as LatestWarningMsgID
           from
              services as s 
                 LEFT JOIN messages m
                    ON s.service_id = m.ticket_id
                    LEFT JOIN messages edits
                       ON m.message_id = edits.edit_id
           WHERE 
                  s.is_delete = 'n' 
              and UNIX_TIMESTAMP(s.`date`) >= 1420070400 
              and UNIX_TIMESTAMP(s.`date`) <= 1451563199
           GROUP BY
              m.ticket_id ) PreQual

         JOIN services s2
            ON PreQual.ticket_id = s2.service_id

            LEFT JOIN makelist as m_id 
               ON CAST(s2.brand as unsigned) = m_id.id 
            LEFT JOIN makelist as m 
               ON s2.brand = m.make_name 

         LEFT JOIN messages origMsg
            ON PreQual.ticket_id = origMsg.ticket_id

         LEFT JOIN messages waitMsg
            ON PreQual.LatestWaitingMsgID = waitMsg.Message_ID

         LEFT JOIN messages warnMsg
            ON PreQual.LatestWaarningMsgID = warnMsg.Message_ID

   where 
         (     PreQual.NumberOfEdits = 0 
           AND PreQual.WaitOrWarnCount > 0 )
      OR
         waitMsg.message_id > 0
      OR
         warnMsg.message_id > 0

The first FROM source is actually a sub-select of just those service tickets within the unix date range and status you are interested in. It is LEFT-JOINED to the messages table only for that service ticket ID. That primary message is then LEFT-JOINED to itself based on ANY edits to the original message for the given service ticket.

Now, if there can be multiple messages per ticket, I am doing a simple count via SUM(case/when) if ANY edits are associated to the message. Next, I am getting a sum() based on either the ORIGINAL TICKET message OR any EDIT messages are stamped as "Waiting", so this way, I know up-front if there are ANY Waiting messages. Similarly checking for any WARNING looking at both the original message or any edits. For grins, I am also summing an overall count of ANY WAITING or WARNING associated messages per service ticket.

Next, I am getting the maximum date/time stamp for any possible edit that is associated with waiting or warning for the ticket in question.

Finally, I am getting the latest MESSAGE ID VALUE for the corresponding waiting or warning per the specific ticket if/when applicable. I am using THIS value as IT is the direct message for the service edit ticket regardless of all other service tickets.

So all this is rolled-up to a single row per originally qualified service "Ticket_ID" within the date / status you start with.

Now, the additional joins. Here, I am joining back to the services table on the qualified ticket PreQual result, so I do not have to reapply the unix date/time for the outer query portion... I already have the ticket ID. I then to LEFT JOINs to the original message, whatever the LATEST waiting message and warning message as applicable to EACH TICKET.

At last, now I can apply the overall WHERE clause, and you need to confirm or adjust as needed.

My first criteria of tickets you are interested in are those service tickets that have no pending edits that are waiting or warning, but the ORIGINAL message was at least a status of waiting or warning.

The second criteria (OR'd) is IF THERE WAS an entry with an edit status of WAITING (already qualified from the sum case/when of prequal, I already KNOW it's status was waiting)

The third criteria (OR'd) is likewise for an edit status that has a WARNING (again, from sum case/when of the prequal query).

So, if your data is a year or more, and you are only looking for tickets within the current day / week (or whatever) range, you are only considering those messages and edits, not the entire history of everything.

Again, you may need to finalize what you want, but I think I am very close...

ONE FINAL ADDITION...

Not knowing the context of the messages if whatever the last message is, supersedes all prior, this could SIGNIFICANTLY reduce your issues too, so please clarify after.

If a given product is in for service, it gets a ticket ID and default message that a person is "WAITING" for the unit to be picked up. Something happens and an edit is made to the original message, thus an EDIT entry created and warning is indicated, so now, you have the original entry of waiting, and the follow-up as warning. After the customer is contacted and the warning resolved, the product finishes its service and another edit is made and the ticket is closed, so the FINAL edit has no value for either waiting or warning. In this case, the LAST EDIT, regardless of any prior message or edit to prior message "wins" the overall status... The ticket is complete.

Similarly, if a ticket starts as WAITING, and then an edit is made for WARNING, the WARNING (now the most recent) is the primary consideration of status.

If this latest scenario better describes the work flow of service ticket operations, please confirm and I will revise the query to even further simplify.

Upvotes: 3

Related Questions