Michael Zigldrum
Michael Zigldrum

Reputation: 256

MySQL 8.0.23 crashes with Segfault on Select query

I hope someone might have additional ideas on what is going wrong and how we could solve it. Our MySQL Database keeps crashing due to Signal 11 (Segfault) from the OS.

Quick overview of the server:

Centos 7

MySQL Server Community Edition 8.0.23 via the official mysql repository

4 cores

8GB RAM

100GB HDD with 85GB free space

We started to get crashes with the following log:

15:09:23 UTC - mysqld got signal 11 ;
Most likely, you have hit a bug, but this error can also be caused by malfunctioning hardware.
Thread pointer: 0x7f678c000d20
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f6800141c30 thread_stack 0x46000
/usr/sbin/mysqld(my_print_stacktrace(unsigned char const*, unsigned long)+0x3d) [0x21e102d]
/usr/sbin/mysqld(handle_fatal_signal+0x313) [0x1009643]
/lib64/libpthread.so.0(+0xf5d0) [0x7f68181975d0]
/usr/sbin/mysqld() [0x24c7e0e]
/usr/sbin/mysqld(trx_undo_update_rec_get_update(unsigned char const*, dict_index_t const*, unsigned long, unsigned long, unsigned long, unsigned long, trx_t*, mem_block_info_t*, upd_t**, lob::undo_vers_t*, type_cmpl_t&)+0x7bb) [0x24c9d3b]
/usr/sbin/mysqld(trx_undo_prev_version_build(unsigned char const*, mtr_t*, unsigned char const*, dict_index_t const*, unsigned long*, mem_block_info_t*, unsigned char**, mem_block_info_t*, dtuple_t const**, unsigned long, lob::undo_vers_t*)+0x43d) [0x24cb7cd]
/usr/sbin/mysqld(row_vers_build_for_consistent_read(unsigned char const*, mtr_t*, dict_index_t*, unsigned long**, ReadView*, mem_block_info_t**, mem_block_info_t*, unsigned char**, dtuple_t const**, lob::undo_vers_t*)+0x279) [0x2476cb9]
/usr/sbin/mysqld(row_search_mvcc(unsigned char*, page_cur_mode_t, row_prebuilt_t*, unsigned long, unsigned long)+0x2b34) [0x2462c14]
/usr/sbin/mysqld(ha_innobase::index_read(unsigned char*, unsigned char const*, unsigned int, ha_rkey_function)+0x29a) [0x22c7e2a]
/usr/sbin/mysqld(handler::ha_index_read_map(unsigned char*, unsigned char const*, unsigned long, ha_rkey_function)+0x347) [0x11184e7]
/usr/sbin/mysqld() [0xe8c415]
/usr/sbin/mysqld(join_read_const_table(JOIN_TAB*, POSITION*)+0x87) [0xe8c4d7]
/usr/sbin/mysqld(JOIN::extract_func_dependent_tables()+0x4c3) [0xeaff03]
/usr/sbin/mysqld(JOIN::make_join_plan()+0x10e7) [0xec0fb7]
/usr/sbin/mysqld(JOIN::optimize()+0xbcb) [0xec247b]
/usr/sbin/mysqld(SELECT_LEX::optimize(THD*)+0xb6) [0xf1ebe6]
/usr/sbin/mysqld(SELECT_LEX_UNIT::optimize(THD*, TABLE*, bool)+0x7b) [0xf92dab]
/usr/sbin/mysqld(Sql_cmd_dml::execute_inner(THD*)+0x3d) [0xf1d63d]
/usr/sbin/mysqld(Sql_cmd_dml::execute(THD*)+0x525) [0xf272e5]
/usr/sbin/mysqld(mysql_execute_command(THD*, bool)+0x9f0) [0xecb7a0]
/usr/sbin/mysqld(dispatch_sql_command(THD*, Parser_state*)+0x42b) [0xecff7b]
/usr/sbin/mysqld(dispatch_command(THD*, COM_DATA const*, enum_server_command)+0x1ed6) [0xed22f6]
/usr/sbin/mysqld(do_command(THD*)+0x19c) [0xed302c]
/usr/sbin/mysqld() [0xffa6b8]
/usr/sbin/mysqld() [0x2782d4e]
/lib64/libpthread.so.0(+0x7dd5) [0x7f681818fdd5]
/lib64/libc.so.6(clone+0x6d) [0x7f6816575ead]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f678c009a68): SELECT job_project_id AS projectid, user_id_fk AS userid, job_name AS jobname      FROM job WHERE job_id=2463
Connection ID (thread ID): 9
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

Reading through the logs it seems that the culprit is the following query:

SELECT job_project_id AS projectid, user_id_fk AS userid,
       job_name AS jobname
    FROM job
    WHERE job_id=2463

The table job has the following columns:

Columns:
job_id int AI PK 
job_name varchar(50) 
job_status json 
job_fileset_list json 
user_id_fk int 
job_project_id int 
job_start_time datetime 
job_end_time datetime 
job_pipeline json 
job_task_blobs json 
job_exe_blobs json

The my.cnf is:

# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/8.0/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
innodb_buffer_pool_size = 5500M
#
# Remove the leading "# " to disable binary logging
# Binary logging captures changes between backups and is enabled by
# default. It's default setting is log_bin=binlog
# disable_log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
#
# Remove leading # to revert to previous value for default_authentication_plugin,
# this will increase compatibility with older clients. For background, see:
# https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_default_authentication_plugin
# default-authentication-plugin=mysql_native_password

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid


max_connections=500

We have only ever seen this error with above query. This segfault occurs semi-regularly, where we can run the query fine a couple of times and then it segfaults. We tried running the queries sequentially, no difference. We tried running them with breaks of 1 seconds, no difference.

The server definitely does not go out of memory, there is enough around when the segfault comes along.

I am completely out of ideas and I also cannot find people with the same problem. The closest I found was a bug report

however this is for a much older version of mysql, and did not offer any resolution.

If anyone has any ideas, hunches or even had this problem themselves before, I would very much appreciate your input!

Upvotes: 1

Views: 1629

Answers (1)

Michael Zigldrum
Michael Zigldrum

Reputation: 256

So this is not really a solution to the bug/problem itself.

But thanks to Don we have narrowed the cause down to most probably be MySQL itself. We have switched to MariaDB and the problem does not occur anymore. We will stick with MariaDB for the forseeable future.

Upvotes: 2

Related Questions