Reputation: 904
The following query is taking forever in MySQL 5.7.23, on a macOS 10.13.6 on 2018 Macbook Pro (A1990) with 50GB of free space:
INSERT INTO `a_sg1lib` (`book_id`,`title`, `isbn`, `Zstatus_retrieve_TOC`, `Zstatus_retrieve_classifybyoclc`,
`classifybyoclc_respcode`, `classifybyoclc_calln_lcc`, `classifybyoclc_calln_ddc`, `classifybyoclc_calln_nlm`,
`classifybyoclc_fast`, `classifybyoclc_owi`,`reference`, `data`, `crcomment`, `groupname`, `code`, `indentation`, `path`, `specialtag`,
`specialtag_cat`, `specialtag_master`, `specialtag_master_cat`, `specialtag_override_cat_moddetails`,
`specialtag_override_cat_timestamp`, `specialtag_override_cat`, `pageno`, `specialtag_escalated`, `history`,
`history_classification`, `specialtag_esc_start`, `specialtag_esc_start_type`, `callCore`,
`d1`, `d2`, `d3`, `d4`, `d5`, `d6`, `d7`, `d8`, `d9`, `d10`, `d11`, `d12`)
SELECT `RU_sg1lib_classifybyoclc`.`ID`,`Title`, `IdentifierWODash`, `Zstatus_retrieve_TOC`, `Zstatus_retrieve_classifybyoclc`, `classifybyoclc_respcode`, `classifybyoclc_calln_lcc`,
`classifybyoclc_calln_ddc`, `classifybyoclc_calln_nlm`,
`classifybyoclc_fast`, `classifybyoclc_owi`,`reference`, `data`,
`crcomment`, `groupname`, `code`, `indentation`, `path`, `specialtag`, `specialtag_cat`,
`specialtag_master`, `specialtag_master_cat`, `specialtag_override_cat_moddetails`,
`specialtag_override_cat_timestamp`, `specialtag_override_cat`, `pageno`, `specialtag_escalated`, `history`,
`history_classification`, `specialtag_esc_start`, `specialtag_esc_start_type`, `callCore`,
`d1`, `d2`, `d3`, `d4`, `d5`, `d6`, `d7`, `d8`, `d9`, `d10`, `d11`, `d12`
FROM `RU_sg1lib_classifybyoclc`
RIGHT JOIN `loc_classification`.`LOC_Classification_Text_zFULL_YYY`
ON `RU_sg1lib_classifybyoclc`.`classifybyoclc_calln_lcc`=
`loc_classification`.`LOC_Classification_Text_zFULL_YYY`.`callCore`;
Number of rows:
RU_sg1lib_classifybyoclc - 1+ million rows - MyISAM
LOC_Classification_Text_zFULL_YYY
- 440000+ rows - InnoDB
a_sg1lib
- InnoDB, initially empty
Schema:
CREATE TABLE `RU_sg1lib_classifybyoclc` (
`ID` int(15) unsigned NOT NULL AUTO_INCREMENT,
`Title` varchar(2000) DEFAULT '',
`VolumeInfo` varchar(100) DEFAULT '',
`Series` varchar(300) DEFAULT '',
`Periodical` varchar(200) DEFAULT '',
`Author` varchar(1000) DEFAULT '',
`Year` varchar(14) DEFAULT '',
`Edition` varchar(60) DEFAULT '',
`Publisher` varchar(400) DEFAULT '',
`City` varchar(100) DEFAULT '',
`Pages` varchar(100) DEFAULT '',
`PagesInFile` int(10) unsigned NOT NULL DEFAULT '0',
`Language` varchar(150) DEFAULT '',
`Topic` varchar(500) DEFAULT '',
`Library` varchar(50) DEFAULT '',
`Issue` varchar(100) DEFAULT '',
`Identifier` varchar(300) DEFAULT '',
`ISSN` varchar(9) DEFAULT '',
`ASIN` varchar(200) DEFAULT '',
`UDC` varchar(200) DEFAULT '',
`LBC` varchar(200) DEFAULT '',
`DDC` varchar(45) DEFAULT '',
`LCC` varchar(45) DEFAULT '',
`Doi` varchar(45) DEFAULT '',
`Googlebookid` varchar(45) DEFAULT '',
`OpenLibraryID` varchar(200) DEFAULT '',
`Commentary` varchar(10000) DEFAULT '',
`DPI` int(6) unsigned DEFAULT '0',
`Color` varchar(1) DEFAULT '',
`Cleaned` varchar(1) DEFAULT '',
`Orientation` varchar(1) DEFAULT '',
`Paginated` varchar(1) DEFAULT '',
`Scanned` varchar(1) DEFAULT '',
`Bookmarked` varchar(1) DEFAULT '',
`Searchable` varchar(1) DEFAULT '',
`Filesize` bigint(20) unsigned NOT NULL DEFAULT '0',
`Extension` varchar(50) DEFAULT '',
`MD5` char(32) DEFAULT '',
`Generic` char(32) DEFAULT '',
`Visible` char(3) DEFAULT '',
`Locator` varchar(733) DEFAULT '',
`Local` int(10) unsigned DEFAULT '0',
`TimeAdded` timestamp NOT NULL DEFAULT '2000-01-01 13:00:00',
`TimeLastModified` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`Coverurl` varchar(200) DEFAULT '',
`Tags` varchar(500) DEFAULT '',
`IdentifierWODash` varchar(300) DEFAULT '',
`Zstatus_retrieve_TOC` varchar(255) DEFAULT NULL,
`Zstatus_retrieve_classifybyoclc` varchar(255) DEFAULT NULL,
`classifybyoclc_respcode` text,
`classifybyoclc_calln_lcc` varchar(256) DEFAULT NULL,
`classifybyoclc_calln_ddc` text,
`classifybyoclc_calln_nlm` text,
`classifybyoclc_fast` text,
`classifybyoclc_owi` text,
`classifybyoclc_timest` text,
`pid` text,
`ida` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `MD5` (`MD5`),
KEY `Generic` (`Generic`) USING BTREE,
KEY `VisibleTimeAdded` (`Visible`,`TimeAdded`) USING BTREE,
KEY `TimeAdded` (`TimeAdded`) USING BTREE,
KEY `Topic` (`Topic`(3)) USING BTREE,
KEY `VisibleID` (`Visible`,`ID`) USING BTREE,
KEY `VisibleTimeLastModified` (`Visible`,`TimeLastModified`,`ID`) USING BTREE,
KEY `TimeLastModifiedID` (`TimeLastModified`,`ID`) USING BTREE,
KEY `DOI_INDEX` (`Doi`) USING BTREE,
KEY `Identifier` (`Identifier`),
KEY `classifybyoclc_calln_lcc` (`classifybyoclc_calln_lcc`),
KEY `classifybyoclc_fast` (`classifybyoclc_fast`(300)),
FULLTEXT KEY `Title` (`Title`),
FULLTEXT KEY `Author` (`Author`),
FULLTEXT KEY `Language` (`Language`),
FULLTEXT KEY `Extension` (`Extension`),
FULLTEXT KEY `Publisher` (`Publisher`),
FULLTEXT KEY `Series` (`Series`),
FULLTEXT KEY `Year` (`Year`),
FULLTEXT KEY `Title1` (`Title`,`Author`,`Series`,`Publisher`,`Year`,`Periodical`,`VolumeInfo`),
FULLTEXT KEY `Tags` (`Tags`),
FULLTEXT KEY `Identifierfulltext` (`IdentifierWODash`)
) ENGINE=InnoDB AUTO_INCREMENT=3246566 DEFAULT CHARSET=utf8
CREATE TABLE `LOC_Classification_Text_zFULL_YYY` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`reference` longtext COLLATE utf8mb4_unicode_ci,
`data` longtext COLLATE utf8mb4_unicode_ci,
`crcomment` longtext COLLATE utf8mb4_unicode_ci,
`groupname` longtext COLLATE utf8mb4_unicode_ci,
`code` longtext COLLATE utf8mb4_unicode_ci,
`indentation` int(10) unsigned DEFAULT NULL,
`path` longtext COLLATE utf8mb4_unicode_ci,
`specialtag` longtext COLLATE utf8mb4_unicode_ci,
`specialtag_cat` mediumtext COLLATE utf8mb4_unicode_ci,
`specialtag_master` mediumtext COLLATE utf8mb4_unicode_ci,
`specialtag_master_cat` mediumtext COLLATE utf8mb4_unicode_ci,
`specialtag_override_cat_moddetails` mediumtext COLLATE utf8mb4_unicode_ci,
`specialtag_override_cat_timestamp` mediumtext COLLATE utf8mb4_unicode_ci,
`specialtag_override_cat` mediumtext COLLATE utf8mb4_unicode_ci,
`pageno` longtext COLLATE utf8mb4_unicode_ci,
`specialtag_escalated` longtext COLLATE utf8mb4_unicode_ci,
`history` longtext COLLATE utf8mb4_unicode_ci,
`history_classification` longtext COLLATE utf8mb4_unicode_ci,
`specialtag_esc_start` longtext COLLATE utf8mb4_unicode_ci,
`specialtag_esc_start_type` longtext COLLATE utf8mb4_unicode_ci,
`callCore` varchar(256) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`d1` longtext COLLATE utf8mb4_unicode_ci,
`d2` longtext COLLATE utf8mb4_unicode_ci,
`d3` longtext COLLATE utf8mb4_unicode_ci,
`d4` longtext COLLATE utf8mb4_unicode_ci,
`d5` longtext COLLATE utf8mb4_unicode_ci,
`d6` longtext COLLATE utf8mb4_unicode_ci,
`d7` longtext COLLATE utf8mb4_unicode_ci,
`d8` longtext COLLATE utf8mb4_unicode_ci,
`d9` longtext COLLATE utf8mb4_unicode_ci,
`d10` longtext COLLATE utf8mb4_unicode_ci,
`d11` longtext COLLATE utf8mb4_unicode_ci,
`d12` longtext COLLATE utf8mb4_unicode_ci,
PRIMARY KEY (`id`),
KEY `indentation` (`indentation`),
KEY `callCore` (`callCore`),
FULLTEXT KEY `specialtag` (`specialtag`)
) ENGINE=InnoDB AUTO_INCREMENT=496218 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci |
| a_sg1lib | CREATE TABLE `a_sg1lib` (
`ida` int(11) DEFAULT '0',
`classifybyoclc_respcode` text CHARACTER SET utf8mb4,
`classifybyoclc_timest` text CHARACTER SET utf8mb4,
`isbn` text CHARACTER SET utf8mb4,
`choseStatus` varchar(256) CHARACTER SET utf8mb4 DEFAULT NULL,
`choseStatus2` varchar(1000) CHARACTER SET utf8mb4 DEFAULT NULL,
`choseStatusSpecial` varchar(256) CHARACTER SET utf8mb4 DEFAULT NULL,
`choseStatusMarker` varchar(256) CHARACTER SET utf8mb4 DEFAULT NULL,
`title` text CHARACTER SET utf8mb4,
`subtitle` text CHARACTER SET utf8mb4,
`publisher_name` text CHARACTER SET utf8mb4,
`imprint_name` text CHARACTER SET utf8mb4,
`publication_date` text CHARACTER SET utf8mb4,
`edition_number` text CHARACTER SET utf8mb4,
`authors` text CHARACTER SET utf8mb4,
`editors` text CHARACTER SET utf8mb4,
`others` text CHARACTER SET utf8mb4,
`contributors` text CHARACTER SET utf8mb4,
`is_activated` text CHARACTER SET utf8mb4,
`public_url` text CHARACTER SET utf8mb4,
`date_added` text CHARACTER SET utf8mb4,
`format` text CHARACTER SET utf8mb4,
`rating` text CHARACTER SET utf8mb4,
`engagement_score` text CHARACTER SET utf8mb4,
`university_list_count` text CHARACTER SET utf8mb4,
`published_list_count` text CHARACTER SET utf8mb4,
`award_count` text CHARACTER SET utf8mb4,
`mobile_disabled` text CHARACTER SET utf8mb4,
`categories` text CHARACTER SET utf8mb4,
`year` text CHARACTER SET utf8mb4,
`subjects` text CHARACTER SET utf8mb4,
`topics` text CHARACTER SET utf8mb4,
`topics_facet_filter` text CHARACTER SET utf8mb4,
`topics_detailed` text CHARACTER SET utf8mb4,
`main_subject` text CHARACTER SET utf8mb4,
`main_topic` text CHARACTER SET utf8mb4,
`keywords` text CHARACTER SET utf8mb4,
`description` text CHARACTER SET utf8mb4,
`language_id` text CHARACTER SET utf8mb4,
`language` varchar(256) CHARACTER SET utf8mb4 DEFAULT NULL,
`sales_rights` text CHARACTER SET utf8mb4,
`cover_image` text CHARACTER SET utf8mb4,
`related_isbns` text CHARACTER SET utf8mb4,
`file_size` text CHARACTER SET utf8mb4,
`mobile_disabled_v3` text CHARACTER SET utf8mb4,
`is_restricted` text CHARACTER SET utf8mb4,
`organisation_list` text CHARACTER SET utf8mb4,
`objectID` text CHARACTER SET utf8mb4,
`_highlightResult` text CHARACTER SET utf8mb4,
`chapters` text CHARACTER SET utf8mb4,
`Zstatus_retrieve_TOC` text CHARACTER SET utf8mb4,
`Zstatus_retrieve_classifybyoclc` text CHARACTER SET utf8mb4,
`pid` text CHARACTER SET utf8mb4,
`ida2` int(11) DEFAULT NULL,
`withdrawal_date` text CHARACTER SET utf8mb4,
`book_id` int(13) DEFAULT NULL,
`classifybyoclc_calln_lcc` varchar(256) CHARACTER SET utf8mb4 DEFAULT NULL,
`classifybyoclc_calln_ddc` text CHARACTER SET utf8mb4,
`classifybyoclc_calln_nlm` text CHARACTER SET utf8mb4,
`classifybyoclc_fast` varchar(960) CHARACTER SET utf8mb4 DEFAULT NULL,
`classifybyoclc_owi` text CHARACTER SET utf8mb4,
`classifybyoclc_fast_max` varchar(512) CHARACTER SET utf8mb4 DEFAULT NULL,
`id` int(10) unsigned DEFAULT '0',
`reference` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`data` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`crcomment` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`groupname` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`code` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`indentation` int(10) unsigned DEFAULT NULL,
`path` varchar(1000) DEFAULT NULL,
`specialtag` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_cat` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_master` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_master_cat` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_override_cat_moddetails` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_override_cat_timestamp` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_override_cat` text CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`pageno` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_escalated` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`history` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`history_classification` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_esc_start` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`specialtag_esc_start_type` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`callCore` varchar(256) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
`d1` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d2` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d3` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d4` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d5` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d6` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d7` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d8` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d9` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d10` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d11` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`d12` longtext CHARACTER SET utf8 COLLATE utf8_unicode_ci,
`idmaster` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`idmaster`),
KEY `choseStatus` (`choseStatus`),
KEY `classifybyoclc_calln_lcc` (`classifybyoclc_calln_lcc`),
KEY `book_id` (`book_id`),
KEY `path` (`path`),
KEY `choseStatusSpecial` (`choseStatusSpecial`),
KEY `choseStatusMarker` (`choseStatusMarker`),
KEY `choseStatus2` (`choseStatus2`(256)) USING BTREE,
FULLTEXT KEY `title` (`title`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
I thought what makes the query so slow is the fact that one of the tables is MyISAM, and tried to change that, but even that takes forever. Altering the columns involved in the join from TEXT to VARCHAR also takes forever ;(;(. Basically I am stuck ;(
May I know what is wrong, and how can I make this faster?
Upvotes: 0
Views: 203
Reputation: 108641
What's going on here to make this slow? A few things.
You have big tables and no WHERE clause, so this statement handles a lot of data. It's never going to be sub-minute fast.
Your tables have many CLOB (mediumtext, longtext) columns. The way MySQL stores those columns makes retrieving and storing them surprisingly expensive. Your query insists on retrieving them all and storing them again.
You correctly identified the mix of MyISAM and InnoDB tables as a possible performance problem. But, your table definitions show InnoDB tables, so you may have successfully converted one of them. That's good.
You're joining on this:
ON `RU_sg1lib_classifybyoclc`.`classifybyoclc_calln_lcc`
= `loc_classification`.`LOC_Classification_Text_zFULL_YYY`.`callCore`;
It's very important to performance that the columns in your ON-clauses have precisely the same definition.
The declaration of classifybyoclc_calln_lcc
in RU_sg1lib_classifybyoclc
lacks both its own COLLATE clause and a default COLLATE for the table, so its collation is unknown. It is the database's default, which we don't know.
varchar(256) CHARACTER SET utf8mb4 DEFAULT NULL
The declaration of callCore
in LOC_Classification_Text_zFULL_YYY
is this.
varchar(256) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
If your database default collation is also utf8mb4_unicode_ci
you should be in good shape. Otherwise alter the first table to specify the collation for classifybyoclc_calln_lcc
explicitly.
It's possible you could store those OCLC call numbers using CHARACTER SET latin1 COLLATE latin1_bin
. They may not need to be case-insensitive and they may not contain characters outside the ASCII and western European character sets. If your call numbers work like that, you might alter your tables to use the more efficient collation. Latin-1 text takes a quarter as many bytes to store in indexes as utfmb4 text. If you need case-insensitive searching on the call numbers, use COLLATE latin1_general_ci
.
RIGHT JOIN
is just plain weird. Almost everybody uses LEFT JOIN
, reversing the order of your two tables. Do you really need an outer join like that? If you don't need an outer join, use an ordinary INNER
join.
What can you do about this? You didn't tell us why you need that a_sg1lib
table. If you're using it with queries containing WHERE clauses, to find just a few rows at a time, you could create a VIEW rather than a TABLE. Each lookup will then use the indexes on your existing tables, and you won't have to copy the data.
Upvotes: 0
Reputation: 2343
forgodsakehold, We should talk if this is not clear. View my profile for contact info, please.
Observation 1,
RU_sg1lib_classifybyoclc.ID appears to be INT(15) unsigned
SELECTED for push into a_sg11lib table as book_id
and
a_sg1lib.book_id appears to be simple INT(13) (NOT unsigned and diff lengths)
most often we see same column attributes when moving data.
Observation 2, RIGHT JOIN of calln_lcc = to callCore
RU_sg1lib_classifbyoclc.classifybyoclc_calln_lcc is simple varchar(256)
and
LOC_Classification_Text_zFULL_YYY.callcore is varchar(256) COLLATE utf8mb4_unicode_ci
most often we see matched column definitions for left and right side of =.
Upvotes: 0
Reputation: 142218
SHOW CREATE TABLE
is more descriptive than DESCRIBE
. In particular, I need to see if the indexes are composite and, if so, what order the columns are in.
RU_sg1lib_classifybyoclc: INDEX(classifybyoclc_calln_lcc)
If it turns out that that is really a JOIN
instead of a RIGHT JOIN
, then this might be useful:
LOC_Classification_Text_zFULL_YYY: INDEX(callCore)
I suspect that most of the columns are declared larger than necessary. Shrinking them may help some. For example, the 8-byte BIGINT
is usually bigger than will ever be needed. year
and some ids
are unnecessarily (and inefficiently) TEXT
.
Upvotes: 1