Reputation: 41
I'm running a Cloud Composer environment with Composer version 1.18.7 and Airflow version 1.10.15. According to the Google Cloud Platform documentation, the error message "(_mysql_exceptions.OperationalError) (2006, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0")" indicates that the Airflow database is under heavy load: (https://cloud.google.com/composer/docs/how-to/using/troubleshooting-dags#symptoms_of_airflow_database_being_under_heavy_load).
I tried the solutions suggested in the above link: I added the database maintenance DAG and upgraded the Cloud SQL instance from the default machine type to db-n1-standard-4 (4 vCPUs, 15 GB memory). Unfortunately, neither had any effect on the issue. I still get a lot of these errors every day, and they appear at seemingly random times. I can't find any other solutions anywhere, so I don't know what to try next. The Airflow database is nowhere near full. I checked this by running the following query from Ad Hoc Query in the Airflow UI, with airflow_db selected from the dropdown:
SELECT table_name AS "Table",
ROUND(((data_length + index_length) / 1024 / 1024), 2) AS "Size (MB)"
FROM information_schema.TABLES
WHERE table_schema = "composer-1-18-7-airflow-1-10-15-xxxxx"
ORDER BY (data_length + index_length) DESC;
(I redacted some information.) This gives me the following: results of query. As you can see, nothing indicates that any table is close to full. As a side note, I recently upgraded the Composer image because support for the previous version was ending. I mostly use Airflow with Python operators, but some BashOperator tasks also fail with the same error message. Here is the full error message:
[2022-05-19 10:11:27,820] {taskinstance.py:1152} ERROR - (_mysql_exceptions.OperationalError) (2006, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0")
(Background on this error at: http://sqlalche.me/e/13/e3q8)
Traceback (most recent call last):
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
return fn()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 364, in connect
return _ConnectionFairy._checkout(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
rec = pool._do_get()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 241, in _do_get
return self._create_connection()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
return _ConnectionRecord(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
self.__connect(first_connect_check=True)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
pool.logger.debug("Error on connect(): %s", e)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
connection = pool._invoke_creator(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
return dialect.connect(*cargs, **cparams)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 493, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/opt/python3.8/lib/python3.8/site-packages/MySQLdb/__init__.py", line 85, in Connect
return Connection(*args, **kwargs)
File "/opt/python3.8/lib/python3.8/site-packages/MySQLdb/connections.py", line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
_mysql_exceptions.OperationalError: (2006, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0")
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/airflow/airflow/models/taskinstance.py", line 968, in _run_raw_task
RTIF.write(RTIF(ti=self, render_templates=False))
File "/usr/local/lib/airflow/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/airflow/airflow/models/renderedtifields.py", line 90, in write
session.merge(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2162, in merge
return self._merge(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 2240, in _merge
merged = self.query(mapper.class_).get(key[1])
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 1018, in get
return self._get_impl(ident, loading.load_on_pk_identity)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 1135, in _get_impl
return db_load_fn(self, primary_key_identity)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/loading.py", line 286, in load_on_pk_identity
return q.one()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3490, in one
ret = self.one_or_none()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
ret = list(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
return self._execute_and_instances(context)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3556, in _execute_and_instances
conn = self._get_bind_args(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3571, in _get_bind_args
return fn(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3550, in _connection_from_session
conn = self.session.connection(**kw)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1138, in connection
return self._connection_for_bind(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1146, in _connection_for_bind
return self.transaction._connection_for_bind(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 433, in _connection_for_bind
conn = bind._contextual_connect()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2302, in _contextual_connect
self._wrap_pool_connect(self.pool.connect, None),
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2339, in _wrap_pool_connect
Connection._handle_dbapi_exception_noconnection(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1583, in _handle_dbapi_exception_noconnection
util.raise_(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2336, in _wrap_pool_connect
return fn()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 364, in connect
return _ConnectionFairy._checkout(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 778, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 495, in checkout
rec = pool._do_get()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/impl.py", line 241, in _do_get
return self._create_connection()
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 309, in _create_connection
return _ConnectionRecord(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 440, in __init__
self.__connect(first_connect_check=True)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 661, in __connect
pool.logger.debug("Error on connect(): %s", e)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/pool/base.py", line 656, in __connect
connection = pool._invoke_creator(self)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/strategies.py", line 114, in connect
return dialect.connect(*cargs, **cparams)
File "/opt/python3.8/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 493, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/opt/python3.8/lib/python3.8/site-packages/MySQLdb/__init__.py", line 85, in Connect
return Connection(*args, **kwargs)
File "/opt/python3.8/lib/python3.8/site-packages/MySQLdb/connections.py", line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2006, "Lost connection to MySQL server at 'reading initial communication packet', system error: 0")
(Background on this error at: http://sqlalche.me/e/13/e3q8)
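To check whether the errors really are spread randomly over the day or cluster around certain hours, the task logs can be bucketed by hour. The following is only a minimal sketch: it assumes log lines start with a timestamp in the format shown above, and the function name count_errors_per_hour is made up for illustration.

```python
import re
from collections import Counter

# Timestamp prefix of an Airflow task log line, e.g.
# "[2022-05-19 10:11:27,820] {taskinstance.py:1152} ERROR - ..."
LOG_PREFIX = re.compile(r"^\[(\d{4}-\d{2}-\d{2}) (\d{2}):\d{2}:\d{2},\d{3}\]")

# MySQL client error code that appears in the traceback above.
ERROR_MARKER = "(2006,"


def count_errors_per_hour(lines):
    """Return a Counter mapping 'YYYY-MM-DD HH' to the number of 2006 errors.

    Only lines that carry a timestamp are counted, so the repeated
    exception lines inside a single traceback are not double-counted.
    """
    counts = Counter()
    for line in lines:
        match = LOG_PREFIX.match(line)
        if match and "OperationalError" in line and ERROR_MARKER in line:
            counts[f"{match.group(1)} {match.group(2)}"] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python count_errors.py < task.log
    for hour, total in sorted(count_errors_per_hour(sys.stdin).items()):
        print(hour, total)
```

If most of the errors fall into a few hours, comparing those hours with the DAG schedules and the Cloud SQL monitoring graphs may show what is generating the load.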
I would really appreciate any ideas on how to fix this pesky error!
Upvotes: 1
Views: 1056
Reputation: 41
I got this sorted out. It turned out that the Airflow scheduler was putting a lot of load on the Cloud SQL instance. I restarted the scheduler, and it has been a smooth ride since then, with no more MySQL errors.
I have no idea what caused the scheduler to put unnecessary load on the Cloud SQL instance, but the problem appeared at the same time as I updated the Composer image.
So if you have similar issues with Google Cloud Composer, you can start by restarting the scheduler. I spent way too much time debugging an issue with such an easy solution, and I hope nobody else wastes as much time on it.
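For anyone wondering how a restart might look in practice, here is a hedged sketch and not necessarily the exact method I used. It assumes a Composer 1 environment where the scheduler runs in the environment's GKE cluster as pods whose names start with airflow-scheduler-, and that kubectl already has credentials for that cluster. The helper only builds the kubectl delete pod commands from the output of kubectl get pods --all-namespaces --no-headers, so nothing is deleted until you run them yourself. The function name and the pod names in the example are made up.

```python
import shlex

# Assumption: scheduler pods in a Composer 1 cluster are named
# "airflow-scheduler-<suffix>". Adjust the prefix if yours differ.
SCHEDULER_PREFIX = "airflow-scheduler-"


def scheduler_delete_commands(get_pods_output):
    """Build one `kubectl delete pod` command per scheduler pod.

    `get_pods_output` is the text printed by
    `kubectl get pods --all-namespaces --no-headers`, where the first
    column is the namespace and the second is the pod name.
    """
    commands = []
    for row in get_pods_output.splitlines():
        fields = row.split()
        if len(fields) < 2:
            continue  # skip blank or malformed rows
        namespace, pod = fields[0], fields[1]
        if pod.startswith(SCHEDULER_PREFIX):
            commands.append(
                ["kubectl", "delete", "pod", pod, "--namespace", namespace]
            )
    return commands


if __name__ == "__main__":
    # Hypothetical listing; real namespaces and pod names will differ.
    listing = (
        "example-namespace airflow-scheduler-abc123 2/2 Running 0 3d\n"
        "example-namespace airflow-worker-def456 2/2 Running 0 3d\n"
    )
    for command in scheduler_delete_commands(listing):
        print(shlex.join(command))  # review, then run manually
```

Deleting a pod that belongs to a Deployment makes Kubernetes create a replacement, which is what restarts the scheduler here.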
Upvotes: 3