Russian Version

Chapter 10. Lost connection to MySQL server during query

You can see error "Lost connection to MySQL server" not only because too small connect_timeout, but because other reasons too. In this chapter we discuss these reasons.

$php phpconf2009_4.php
string(44) "Lost connection to MySQL server during query"

Most likely error log will show what happened:

Version: '5.1.39' socket: '/tmp/mysql_sandbox5139.sock' port: 5139 MySQL Community Server (GPL)
091002 14:56:54 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.
key_buffer_size=8384512
read_buffer_size=131072
max_used_connections=1
max_threads=151
threads_connected=1

It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338301 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
thd: 0x69e1b00
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x450890f0 thread_stack 0x40000
/users/ssmirnova/blade12/5.1.39/bin/mysqld(my_print_stacktrace+0x2e)[0x8ac81e]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(handle_segfault+0x322)[0x5df502]
/lib64/libpthread.so.0[0x3429e0dd40]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN6String4copyERKS_+0x16)[0x5d9876]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_cache_str5storeEP4Item+0xc9)[0x52ddd9]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN26select_singlerow_subselect9send_dataER4ListI4ItemE+0x45)[0x5ca145]
/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x6386d1]
/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x64236a]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN4execEv+0x949)[0x658869]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN30subselect_single_select_engine4execEv+0x36c)[0x596f3c]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_subselect4execEv+0x26)[0x595d96]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN24Item_singlerow_subselect8val_realEv+0xd)[0x595fbd]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Arg_comparator18compare_real_fixedEv+0x39)[0x561b89]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN12Item_func_ne7val_intEv+0x23)[0x568fb3]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN8optimizeEv+0x12ef)[0x65208f]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0xa0)[0x654850]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x16c)[0x65a1cc]
/users/ssmirnova/blade12/5.1.39/bin/mysqld[0x5ecbda]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z21mysql_execute_commandP3THD+0x602)[0x5efdd2]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x357)[0x5f52f7]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0xe93)[0x5f6193]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_Z10do_commandP3THD+0xe6)[0x5f6a56]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(handle_one_connection+0x246)[0x5e93f6]
/lib64/libpthread.so.0[0x3429e061b5]
/lib64/libc.so.6(clone+0x6d)[0x34292cd39d]`)
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x6a39e60 = select 1 from `t1` where `c0` <> (select geometrycollectionfromwkb(`c3`) from `t1`)
thd->thread_id=2
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what

As we can see MySQL server crashed due to system signal 11 ( mysqld got signal 11). This means MySQL server asked operating system for a resource (for example, access to a file or RAM), but got reject with code 11. In most of operating systmes this signal means Segmentation fault - access to RAM was rejected. More detailed information you find in user manual for your operating system. For UNIX this is `man signal`. In Windows you usually get record similar to "mysqld got exception 0xc0000005". Search Windows user manual for exact meaning of exception codes.

You also can get error similar to:

100828 21:52:31 InnoDB: Error: cannot allocate 8589950976 bytes of
InnoDB: memory with malloc! Total allocated memory
InnoDB: by InnoDB 46483728 bytes. Operating system errno: 12

"Operating system errno: 12" is operating system error. Numbers of operating system errors are not eveywhere same and can be different from system to system. If you want to know what particular error code means at your operating system use perror utility which is part of MySQL installation and placed in the bin directory. For example, you can see ouptut of perror for my MacOSX installation:

$perror 12
OS error code 12: Cannot allocate memory

Below we see backtrace (starting from "Attempting backtrace.") We will discuss backtrace later.

Still below we can see the query which caused the problem:

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x6a39e60 = select 1 from `t1` where `c0` <> (select geometrycollectionfromwkb(`c3`) from `t1`)

Problematic query is

select 1 from `t1` where `c0` <> (select geometrycollectionfromwkb(`c3`) from `t1`)

Lets try to repeat the problem in mysql cli

$./my sql
mysql [localhost] {msandbox} ((none)) > use test
Database changed
mysql [localhost] {msandbox} (test) > select 1 from `t1` where `c0` <> (select geometrycollectionfromwkb(`c3`) from `t1`);
ERROR 2013 (HY000): Lost connection to MySQL server during query
mysql [localhost] {msandbox} (test) > \q
Bye

This means this is the bug which can be repeated every time. In this case we should change application in such a way so bug will be avoided until it is fixed.

In this case backtrace will help us.

/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Item_subselect4execEv+0x26)[0x595d96]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN24Item_singlerow_subselect8val_realEv+0xd)[0x595fbd]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN14Arg_comparator18compare_real_fixedEv+0x39)[0x561b89]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN12Item_func_ne7val_intEv+0x23)[0x568fb3]
/users/ssmirnova/blade12/5.1.39/bin/mysqld(_ZN4JOIN8optimizeEv+0x12ef)[0x65208f]

It contains following calls: Item_subselect and Item_singlerow_subselect. From here – even without looking in MySQL code – we can decide what usage of subquery causes the crash.

Lets try to rewrite the query

$./my sql
mysql [localhost] {msandbox} (test) > select 1 from `t1` where `c0` <> geometrycollectionfromwkb(`c3`);
Empty set (0.00 sec)

MySQL server works fine! We can use this rewritten query until bug is fixed.

Method #18: always use error log

But sometimes error log does not contain necessary information

Same query, but on my Mac box

091002 16:49:48 - mysqld got signal 10 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=8384512
read_buffer_size=131072
max_used_connections=1
max_connections=100
threads_connected=1
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 225784 K

As we can see there is no backtrace, neither query in the error log. What can we do?

In this case, as earlier, general query log will help us. MySQL firstly writes query to the general query log and only later executes it. Therefore we can use such a method for repeatable problem queries.

mysql> set global general_log=1;
Query OK, 0 rows affected (0.00 sec)
mysql> set global log_output='table';
Query OK, 0 rows affected (0.00 sec)

Run test. After server restart check general query log:

mysql> select argument from mysql.general_log order by event_time desc limit 10;
+--------------------------------------------+
| argument                                   |
+--------------------------------------------+
| Access denied for user 'MySQL_Instance_Manager'@'localhost' (using password: YES)                             |
| select 1 from `t1` where `c0` <>  (select geometrycollectionfromwkb(`c3`) from `t1`) |

Query, which caused crash, has been found!

Mehtod #19: use general query log if error log does not contain enough information about server crash.

While using this method there is a chance what table mysql.general_log can be crashed in time when MySQL server crashes. In this case try to use logging to file.

There is also a chance what MySQL server will stop working when it will be writing query into general query log. In such a case either use log files in your application or proxy.

Example, which we discussed above, based on MySQL server bug. But MySQL server can be stopped because out of resources in the operating system.

First thing you need to examine is RAM

This is the quote from real error log file:

key_buffer_size=235929600
read_buffer_size=4190208
max_used_connections=17
max_connections=2048
threads_connected=13
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_connections = 21193712 K
-----
21193712K ~= 20G

This shows MySQL can use up to 20G RAM! There are powerful boxes now, but it is worth checking if you really have 20G RAM.

Method #20: always check if you have enough RAM for allocated buffers.

Also pay attention for system variable max_connections.

In the example above there was max_connections=2048. This is quite high. Check if you have enough resources for such amount of simultaneous connections.

I often met cases when users set value of max_connections much higher than their servers can handle. This leads to unexpected crashes of MySQL server when load on web-resources suddenly become high.

Method #21: set realistic value of max_connections based on your operating system resources.

MySQL server can also meet shortage of other resources. Usually such information exists in the error log file. Analyze this information and solve the problem.

But MySQL server itself not always the cause of shortage of resources. Can happen what other application uses resources needed by MySQL server. In this cases use operating system means for monitoring to find the application which causes such shortage.

Method #22: use monitoring tools of your operating system to find which application use enourmous amount of resources which lead to crashes of MySQL server.

As we disucssed earlier message "Lost connection to MySQL server" can also mean timeout. If error log doesn't contain other errors or you suspect such a case add option log_warnings=2 into configuration file and check error log file after you got the message.

Mehtod #23: Use option log_warnings=2 to examine if you have rejected connections.

Sometimes error only happens if many queries run concurrently. Use additional programs and general query log to find which queries should run. I would not describe how to find and solve such errors, because this is difficult and requires special solution for each case.

Back Content Forward



Author 2010 Sveta Smirnova
COPYRIGHT © 2010 S.Smirnova and S. Lasunov
sveta_at_js-client_dot_com