Reputation: 115
Running Django 1.3 over wsgi on apache 2.2.9/Debian, and using mysql 5.0.51a I encountered the following problem, both in the deployed django installation and in both development servers we had running, using 2 databases.
For every user, a certain function resulted in a 2006: server has gone away error. The next to last function called before things start to go wrong is in a thread, like:
t = threading.Thread(target = logic.report,
args = [proj_info, userdata]
)
t.setDaemon(True)
t.start()
It's something happening in the background while the GUI (the django website) continues to be available. This is something that has been going well for tens of dozens of times, but this afternoon it failed.
The error Django returned (through the browser and of course I forgot to keep that tab open for additional info) pointed to something happening at 14:15:18, so here is some mysql.log
120202 14:15:18 3449 Connect beta@localhost on beta2db
3449 Query SET NAMES utf8
3449 Query set autocommit=0
3449 Query SELECT `auth_user`.`id`, `auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`, `auth_user`.`email`, `auth_user`.`password`, `auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`is_superuser`, `auth_user`.`last_login`, `auth_user`.`date_joined` FROM `auth_user` WHERE `auth_user`.`id` = 3
3449 Query SELECT `insight_test_run`.`id`, `insight_test_run`.`rawfile`, `insight_test_run`.`koekfile`, `insight_test_run`.`measure_date`, `insight_test_run`.`test_id`, `insight_test_run`.`subject_id`, `insight_test_run`.`quality` FROM `insight_test_run` WHERE `insight_test_run`.`id` = 514
3449 Query SELECT `insight_project`.`id`, `insight_project`.`client`, `insight_project`.`description`, `insight_project`.`directory` FROM `insight_project` INNER JOIN `insight_project_test_run` ON (`insight_project`.`id` = `insight_project_test_run`.`project_id`) WHERE `insight_project_test_run`.`test_run_id` = 514
3449 Query SELECT `auth_user`.`id`, `auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`, `auth_user`.`email`, `auth_user`.`password`, `auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`is_superuser`, `auth_user`.`last_login`, `auth_user`.`date_joined` FROM `auth_user` INNER JOIN `insight_project_user` ON (`auth_user`.`id` = `insight_project_user`.`user_id`) WHERE `insight_project_user`.`project_id` = 6
3449 Query SELECT `django_content_type`.`app_label`, `auth_permission`.`codename` FROM `auth_permission` INNER JOIN `auth_group_permissions` ON (`auth_permission`.`id` = `auth_group_permissions`.`permission_id`) INNER JOIN `auth_group` ON (`auth_group_permissions`.`group_id` = `auth_group`.`id`) INNER JOIN `auth_user_groups` ON (`auth_group`.`id` = `auth_user_groups`.`group_id`) INNER JOIN `django_content_type` ON (`auth_permission`.`content_type_id` = `django_content_type`.`id`) WHERE `auth_user_groups`.`user_id` = 3
3449 Query rollback
3449 Query SELECT `insight_subject`.`id`, `insight_subject`.`name`, `insight_subject`.`nice_name`, `insight_subject`.`handedness`, `insight_subject`.`birthday`, `insight_subject`.`gender`, `insight_subject`.`education`, `insight_subject`.`eyecorrection`, `insight_subject`.`extra` FROM `insight_subject` WHERE `insight_subject`.`id` = 10000456
3449 Query SELECT `insight_test`.`id`, `insight_test`.`description`, `insight_test`.`level_id`, `insight_test`.`name` FROM `insight_test` WHERE `insight_test`.`id` = 1
3449 Query SELECT `insight_test_level`.`id`, `insight_test_level`.`description`, `insight_test_level`.`official_name` FROM `insight_test_level` WHERE `insight_test_level`.`id` = 1
3449 Quit
Here is a bit of apache log (/var/log/apache2/error.log):
[Thu Feb 02 14:17:00 2012] [error] Exception in thread Thread-2:
[Thu Feb 02 14:17:00 2012] [error] Traceback (most recent call last):
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/threading.py", line 530, in __bootstrap_inner
[Thu Feb 02 14:17:00 2012] [error] self.run()
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/threading.py", line 483, in run
[Thu Feb 02 14:17:00 2012] [error] self.__target(*self.__args, **self.__kwargs)
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/www/wsgi-scripts/portal/ovl_webinterface/insight/logic.py", line 50, in report
[Thu Feb 02 14:17:00 2012] [error] reportfile = ovl.report.make_report(django_test_run_id = j, img = trunfo['img'], text_style = trunfo['style'], anonymous = anon)
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/site-packages/ovl_analystT/reporting.py", line 90, in make_report
[Thu Feb 02 14:17:00 2012] [error] info = dbman.gather_all_test_run_info(django_test_run_id = django_test_run_id)
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/site-packages/ovl_analystT/dbconnector.py", line 802, in gather_all_test_run_info
[Thu Feb 02 14:17:00 2012] [error] test_run = self.get_table_rows2(table = 'test_run', convert_to_one = True, django_id = django_test_run_id)
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/site-packages/warehouseT/manager.py", line 213, in get_table_rows2
[Thu Feb 02 14:17:00 2012] [error] cols = kwargs.pop('columns',self.get_table_column_names(table))#if columns are specified (using keyword 'columns') it only loads those.
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/site-packages/warehouseT/manager.py", line 116, in get_table_column_names
[Thu Feb 02 14:17:00 2012] [error] res = self.execute('''DESCRIBE %s'''%(table))#this function is so basic.. it should present no trouble, so no debug stuff.
[Thu Feb 02 14:17:00 2012] [error] File "/usr/local/lib/python2.7/site-packages/warehouseT/manager.py", line 91, in execute
[Thu Feb 02 14:17:00 2012] [error] self.cursor.execute(cmd)#this may raise warnings which we need to catch
[Thu Feb 02 14:17:00 2012] [error] File "build/bdist.linux-x86_64/egg/MySQLdb/cursors.py", line 174, in execute
[Thu Feb 02 14:17:00 2012] [error] self.errorhandler(self, exc, value)
[Thu Feb 02 14:17:00 2012] [error] File "build/bdist.linux-x86_64/egg/MySQLdb/connections.py", line 36, in defaulterrorhandler
[Thu Feb 02 14:17:00 2012] [error] raise errorclass, errorvalue
[Thu Feb 02 14:17:00 2012] [error] OperationalError: (2006, 'MySQL server has gone away')
(Looking at the timestamp of this piece of log, I'm guessing that the django error-site pointed to the moment the request was done, and not to the time the error was raised)
So basically this is an error in my own piece of code (that, again, has worked lots and lots f times without fail), that gets stopped by mysql.
I think the mysql log looks normal. I did look over the lines before the ones I'm posting, but I didn't recognise any strange stuff, but I don't read them daily. Logging in locally to the mysql-sever using the root user as well as different users presented no problem of any kind. All the things I expected to be able to do I could do. I did not try anything as user www-data as I frankly don't know how to log in. Also, django uses another username to log in to the mysql-server.
Restarting the mysql-server, no result. Restarting apache2 and django development server as well as mysql-server, no result. Half an hour later, everything was fine..
I realize I'm giving you almost nothing to go on. I would love to post some more, but as I mentioned I closed the Django error page. And I can't reproduce this error, which is kinda frightening. This error happened consistently for at least a quarter of an hour, but maybe three quarters of an hour. Then every thing worked again and I don't know why. To my untrained eye, there is no clue to what changed in that time, both in apache2 and the database..
Oh and yeah I googled this, found something about wait_timeout, found it's value was probably high enough (default), and only stuff about mysql going away after much idleness and than everything being fine after refreshing the page.
Can anyone help me with why this happened and what to do to protect against this 2006 error?
Upvotes: 2
Views: 1900
Reputation: 115
So at various sites I found this error occurs when a connection is open for a long time. I ignored this because I thought this wasn't applicable to my situation. After much deliberation I rewrote code to automatically reconnect when this error occurred and then retry, but it didn't feel right. Then I discovered that my code did keep an open connection for a long time, without it being necessary per se. The object(s) I used for database-access were instantiated at various module levels, so I guess they were created only when I restarted apache. No wonder the connection went a way a few days later..
So now I create the database objects only inside the functions needing them and haven't had a problem with error 2006 since :D. The problem was between chair and keyboard all along..
Upvotes: 2