Reputation: 312
When I run Luigi tasks, the framework sometimes crashes, which causes all subsequent tasks to fail. Here is the error log:
2017-10-05 22:02:02,564 luigi-interface WARNING Failed pinging scheduler
2017-10-05 22:02:03,129 requests.packages.urllib3.connectionpool INFO Starting new HTTP connection (126): localhost
2017-10-05 22:02:03,130 luigi-interface ERROR Failed connecting to remote scheduler 'http://localhost:8082'
Traceback (most recent call last):
...
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/requests/sessions.py", line 585, in send
r = adapter.send(request, **kwargs)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/requests/adapters.py", line 467, in send
raise ConnectionError(e, request=request)
ConnectionError: HTTPConnectionPool(host='localhost', port=8082): Max retries exceeded with url: /api/add_worker (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f15128cb3d0>: Failed to establish a new connection: [Errno 111] Connection refused',))
2017-10-05 22:02:03,180 luigi-interface INFO Worker Worker(salt=150908931, workers=3, host=etl2, username=develop, pid=18019) was stopped. Shutting down Keep-Alive thread
Traceback (most recent call last):
File "app_metadata.py", line 1567, in <module>
luigi.run()
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/interface.py", line 210, in run
return _run(*args, **kwargs)['success']
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/interface.py", line 238, in _run
return _schedule_and_run([cp.get_task_obj()], worker_scheduler_factory)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/interface.py", line 197, in _schedule_and_run
success &= worker.run()
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/worker.py", line 867, in run
self._add_worker()
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/worker.py", line 652, in _add_worker
self._scheduler.add_worker(self._id, self._worker_info)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/rpc.py", line 219, in add_worker
return self._request('/api/add_worker', {'worker': worker, 'info': info})
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/rpc.py", line 146, in _request
page = self._fetch(url, body, log_exceptions, attempts)
File "/home/develop/data_warehouse/venv/local/lib/python2.7/site-packages/luigi/rpc.py", line 138, in _fetch
last_exception
luigi.rpc.RPCError: Errors (3 attempts) when connecting to remote scheduler 'http://localhost:8082'
It looks like the worker tries to ping the central scheduler, fails, and then crashes; all later tasks are then blocked and cannot run successfully.
Someone else hit a similar error, but their resolution does not work for me: GitHub - Failed connecting to remote scheduler #1894
Upvotes: 2
Views: 3633
Reputation: 408
Have you configured the central scheduler properly? See the docs: https://luigi.readthedocs.io/en/stable/central_scheduler.html
If not, try using the local scheduler by specifying --local-scheduler on the command line.
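The "[Errno 111] Connection refused" in the log usually means nothing is accepting connections on localhost:8082 at that moment. As a rough way to confirm that before launching tasks, here is a minimal sketch using only the standard library. The function name is made up for illustration, and it assumes Python 3 (the traceback in the question is from Python 2.7, where the exception types differ).

```python
import socket


def scheduler_is_listening(host="localhost", port=8082, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (e.g. connection refused) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if scheduler_is_listening():
        print("Something is listening on localhost:8082.")
    else:
        print("Nothing is listening on localhost:8082.")
```

This only shows that the port is open, not that the process behind it is a healthy scheduler, so treat a True result as a hint rather than proof.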
Upvotes: 0
Reputation: 8290
If your central scheduler is getting overloaded, I would try making the connection timeout a little longer. You could also increase the number of retries and the wait time between them.
In luigi.cfg:
[core]
# default is 10.0
rpc-connect-timeout=60.0
# default is 3
rpc-retry-attempts=10
# default is 30
rpc-retry-wait=60
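One thing to watch when editing these settings: Python's standard configparser does not strip a `#` comment that follows a value on the same line by default, so it is safer to keep comments on their own lines. The sketch below shows this with the standard library in Python 3. Whether Luigi's own config loader behaves identically in your version is an assumption, and the helper name is made up for illustration.

```python
import configparser


def read_core_option(cfg_text, option):
    """Parse cfg_text and return the raw string stored for [core] option."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("core", option)


# Same setting written two ways: comment on the value line vs. on its own line.
INLINE = "[core]\nrpc-connect-timeout=60.0 #default is 10.0\n"
OWN_LINE = "[core]\n# default is 10.0\nrpc-connect-timeout=60.0\n"

print(repr(read_core_option(INLINE, "rpc-connect-timeout")))
print(repr(read_core_option(OWN_LINE, "rpc-connect-timeout")))
```

With the inline form the comment text ends up inside the value, which then cannot be converted to a float; with the comment on its own line the value is just 60.0.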
You may also want to add a watchdog so that the scheduler process is restarted automatically if it crashes.
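As a rough illustration of the restart idea, here is a minimal supervisor loop in Python. The `supervise` function and its parameters are hypothetical, and the usage line assumes `luigid` is on your PATH and runs in the foreground. In practice a process manager such as systemd or supervisord is the more usual choice; this is only a sketch of the mechanism.

```python
import subprocess
import time


def supervise(cmd, max_restarts=5, delay=5.0):
    """Run cmd, restarting it each time it exits, up to max_restarts times.

    Returns the number of restarts performed.
    """
    restarts = 0
    while True:
        exit_code = subprocess.call(cmd)  # blocks until the process exits
        if restarts >= max_restarts:
            print("giving up after %d restarts (last exit code %s)"
                  % (restarts, exit_code))
            return restarts
        restarts += 1
        print("process exited with code %s; restart %d of %d"
              % (exit_code, restarts, max_restarts))
        time.sleep(delay)  # brief pause so a crash loop does not spin


# Hypothetical usage (assumes luigid is installed and on PATH):
# supervise(["luigid", "--port", "8082"])
```

The restart cap is there so a scheduler that fails immediately on startup does not get relaunched forever; if the command cannot be found at all, `subprocess.call` raises an exception instead of returning.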
Upvotes: 2