8adger
8adger

Reputation: 67

mysql optimizer confusion explain plan

I am hoping someone can help decipher an explain plan and information in the optimizer trace.

I have the following query:

select doc_id,
       group_concat(distinct concat(a.event_time, id mod 10) order by id) as times,
       count(distinct id) as count_of_items   
        FROM doc_audit  a,
             rwfd1213310051e2821b404f29bc453ce7386f764e b 
        WHERE b.rptid = a.doc_id 
          AND  a.property = 'status'  
        group by 1;

And get this execution plan.

*************************** 1. row ***************************
           id: 1                                              
  select_type: SIMPLE                                         
        table: b                                              
   partitions: NULL                                           
         type: index                                          
possible_keys: idx_tmp_rptid                                  
          key: idx_tmp_rptid                                  
      key_len: 8                                              
          ref: NULL                                           
         rows: 7153                                           
     filtered: 100.00                                         
        Extra: Using index; Using temporary; Using filesort   
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: a
   partitions: NULL
         type: ref                                            
possible_keys: DOC_ID_PROPERTY_IDX,event_time,event_time2
          key: DOC_ID_PROPERTY_IDX
      key_len: 776
          ref: database.b.rptid,const
         rows: 2
     filtered: 43.75
        Extra: Using where

My limited understanding is the order in the explain plan is the order the plan is executed. If this is the case I don't understand where the "Using filsort" comes from on table b? If I enable the optimzer profile trace it suggests the filsort is related to table a. I reached this conclusion because doc_id is in table a. Furthermore I believe filsort simply means "Sort"? doc_id is indexed so I don't understand why it needs sorting further.

        "filesort_information": [                        
          {                                              
            "direction": "asc",                          
            "table": "intermediate_tmp_table",           
            "field": "**doc_id**"                            
          }                                              
        ],                                               
        "filesort_priority_queue_optimization": {        
          "usable": false,                               
          "cause": "not applicable (no LIMIT)"           
        },                                               
        "filesort_execution": [                          
        ],                                               
        "filesort_summary": {                            
          "rows": 17620,                                 
          "examined_rows": 17620,                        
          "number_of_tmp_files": 2,                      
          "sort_buffer_size": 262128,                    
          "sort_mode": "<sort_key, rowid>"               

I would really appreciate any help on this. Apologies if my understanding is way off.

doc_audit

    PRIMARY KEY (id), 
    KEY DOC_ID_PROPERTY_IDX (doc_id,property), 
    KEY event (instance_id,event,event_time), 
    KEY event_time (doc_id,event,event_time), 
    KEY event_user (instance_id,user_id,event,event_time), 
    KEY event_time2 (event_time,doc_id,event), 
    KEY user (user_id,event_time,instance_id), 
    KEY event_time_user_event (event_time,user_id,event(80))

Adding the table defs:

CREATE TABLE doc_audit (
  id bigint(20) NOT NULL AUTO_INCREMENT,
  doc_id bigint(20) NOT NULL,
  event varchar(255) ,
  event_time datetime NOT NULL,
  instance_id bigint(20) NOT NULL,
  major_version int(11) NOT NULL,
  minor_version int(11) NOT NULL,
  new_value varchar(1500) DEFAULT NULL,
  old_value varchar(1500) DEFAULT NULL,
  property varchar(255)  DEFAULT NULL,
  signature_meaning varchar(255)  DEFAULT NULL,
  task_name varchar(255)  DEFAULT NULL,
  user_id bigint(20) NOT NULL,
  version int(11) DEFAULT NULL,
  workflow_name varchar(255)  DEFAULT NULL,
  user_on_behalf_of bigint(20) DEFAULT NULL,
  xml longtext,
  user_name varchar(255) DEFAULT NULL,
  user_display_name varchar(255) DEFAULT NULL,
  document_number varchar(255) DEFAULT NULL,
  property_public_key varchar(100) DEFAULT NULL,
  property_label varchar(500) DEFAULT NULL,
  old_value_public_key varchar(100) DEFAULT NULL,
  old_value_label varchar(1500) DEFAULT NULL,
  new_value_public_key varchar(100) DEFAULT NULL,
  new_value_label varchar(1500) DEFAULT NULL,
  workflow_label varchar(500) DEFAULT NULL,
  task_label varchar(500) DEFAULT NULL,
  precalculated bit(1) DEFAULT b'0',
  PRIMARY KEY (id),
  KEY DOC_ID_PROPERTY_IDX (doc_id,property),
  KEY event (instance_id,event,event_time),
  KEY event_time (doc_id,event,event_time),
  KEY event_user (instance_id,user_id,event,event_time),
  KEY event_time2 (event_time,doc_id,event),
  KEY user (user_id,event_time,instance_id),
  KEY event_time_user_event (event_time,user_id,event(80))
) ENGINE=InnoDB AUTO_INCREMENT=1209095 DEFAULT CHARSET=utf8

and..

CREATE TEMPORARY TABLE rwfd1213310051e2821b404f29bc453ce7386f764e (
  wfrptid bigint(20) NOT NULL DEFAULT '0',
  workflow_status__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  workflow_id__v bigint(20) DEFAULT NULL,
  task_capacity_key__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  workflow_process_version__v bigint(20) DEFAULT NULL,
  workflow_name__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  workflow_initiator_name__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  task_delegate__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  workflow_completionDate__v datetime DEFAULT NULL,
  workflow_initiator__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  workflow_cancelationDate__v datetime DEFAULT NULL,
  workflow_document_id__v bigint(20) DEFAULT NULL,
  task_verdict_key__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  workflow_lifecycle__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  task_assignee_name__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  task_meta_data_id__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  task_completionDate__v datetime DEFAULT NULL,
  workflow_startDate__v datetime DEFAULT NULL,
  task_iteration_count__v bigint(20) DEFAULT NULL,
  rptid bigint(20) NOT NULL DEFAULT '0',
  major_version_number__v bigint(20) DEFAULT NULL,
  minor_version_number__v bigint(20) DEFAULT NULL,
  study__v varchar(1500) COLLATE utf8_bin DEFAULT NULL,
  document_number__v varchar(255) COLLATE utf8_bin DEFAULT NULL,
  status__v varchar(1500) COLLATE utf8_bin DEFAULT NULL,
  name__v varchar(100) COLLATE utf8_bin DEFAULT NULL,
  days_in_qc__c bigint(20) DEFAULT NULL,
  durationInValue15f07c5893cf4ba98a3441aab045c460
         varchar(255) COLLATE utf8_bin DEFAULT NULL,
  0PERIOD int(6) DEFAULT NULL,
  KEY idx_tmp_rptid (rptid)
  ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin

Upvotes: 0

Views: 173

Answers (1)

Rick James
Rick James

Reputation: 142560

Two exceptions to the order of tables in EXPLAIN:

  • Subqueries tend to be at the end of the EXPLAIN.
  • Using temporary and Using filesort tend to be on the first line of the EXPLAIN, giving you no clue of when they are actually used. Also, there may be more than one sort, again without a clue.

The EXPLAIN FORMAT=JSON is better at saying what part needed the sort. Multiple filesorts show up clearly.

"filesort" is a generic term. It might involve disk (slowest), it might be a 'quicksort' in RAM, or it might be "filesort_priority_queue_optimization" (handy with LIMIT). There may be other options.

(Please provide SHOW CREATE TABLE.)

event(80) -- Index prefixes are rarely useful. Was this a TEXT column?

The optimal index for this query for a is

INDEX(property, doc_id, event_time, id)

Upvotes: 1

Related Questions