Reputation: 671
I'm following this guide on how to set up docker with timescale/wale for continuous archiving.
Everything runs as expected, but when I get to the final step, I'm seeing:
written to stdout
2022-04-13 07:43:33.349 UTC [27] LOG: redo done at 0/50000F8
Connecting to wale (172.18.0.3:80)
writing to stdout
- 100% |********************************| 36 0:00:00 ETA
written to stdout
Connecting to wale (172.18.0.3:80)
wget: server returned error: HTTP/1.0 500 INTERNAL SERVER ERROR
2022-04-13 07:43:34.264 UTC [27] LOG: selected new timeline ID: 2
2022-04-13 07:43:34.282 UTC [27] LOG: archive recovery complete
Connecting to wale (172.18.0.3:80)
wget: server returned error: HTTP/1.0 500 INTERNAL SERVER ERROR
2022-04-13 07:43:34.838 UTC [27] LOG: could not open file "pg_wal/000000010000000000000006": Permission denied
2022-04-13 07:43:34.844 UTC [1] LOG: database system is ready to accept connections
It looks like the wget to wale is failing? It's connected to the same network as timescaledb_recovered so shouldn't it work? is there some additional config that the docs are missing? Or am I misreading these logs somehow?
Some additional error output from wale log:
['wal-e', '--terse', 'wal-push', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
Pushing wal file /var/lib/postgresql/data/pg_wal/000000010000000000000012: ['wal-e', '--terse', 'wal-push', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
172.18.0.2 - - [13/Apr/2022 14:09:17] "GET /wal-push/000000010000000000000012 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
Fetching wal 000000010000000000000011: ['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
172.18.0.4 - - [13/Apr/2022 14:09:53] "GET /wal-fetch/000000010000000000000011 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000012', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
Fetching wal 000000010000000000000012: ['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000012', '/var/lib/postgresql/data/pg_wal/000000010000000000000012']
172.18.0.4 - - [13/Apr/2022 14:09:54] "GET /wal-fetch/000000010000000000000012 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
Fetching wal 000000010000000000000011: ['wal-e', '--terse', 'wal-fetch', '-p=0', '000000010000000000000011', '/var/lib/postgresql/data/pg_wal/000000010000000000000011']
172.18.0.4 - - [13/Apr/2022 14:09:54] "GET /wal-fetch/000000010000000000000011 HTTP/1.1" 200 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '00000002.history', '/var/lib/postgresql/data/pg_wal/00000002.history']
Fetching wal 00000002.history: ['wal-e', '--terse', 'wal-fetch', '-p=0', '00000002.history', '/var/lib/postgresql/data/pg_wal/00000002.history']
lzop: short read
wal_e.main CRITICAL MSG: An unprocessed exception has avoided all error handling
DETAIL: Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/wal_e/cmd.py", line 657, in main
args.prefetch)
File "/usr/lib/python3.5/site-packages/wal_e/operator/backup.py", line 353, in wal_restore
self.gpg_key_id is not None)
File "/usr/lib/python3.5/site-packages/wal_e/worker/worker_util.py", line 58, in do_lzop_get
return blobstore.do_lzop_get(creds, url, path, decrypt, do_retry=do_retry)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 52, in do_lzop_get
raise exc
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 64, in write_and_return_error
key.get_contents_to_file(stream)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/calling_format.py", line 53, in get_contents_to_file
with open(self.path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/backups/wal_005/00000002.history.lzo'
STRUCTURED: time=2022-04-13T14:09:55.216294-00 pid=32
Failed to fetch wal 00000002.history: None
Exception on /wal-fetch/00000002.history [GET]
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1816, in full_dispatch_request
return self.finalize_request(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1831, in finalize_request
response = self.make_response(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1957, in make_response
'The view function did not return a valid response. The'
TypeError: The view function did not return a valid response. The function either returned None or ended without a return statement.
172.18.0.4 - - [13/Apr/2022 14:09:55] "GET /wal-fetch/00000002.history HTTP/1.1" 500 -
['wal-e', '--terse', 'wal-fetch', '-p=0', '00000001.history', '/var/lib/postgresql/data/pg_wal/00000001.history']
Fetching wal 00000001.history: ['wal-e', '--terse', 'wal-fetch', '-p=0', '00000001.history', '/var/lib/postgresql/data/pg_wal/00000001.history']
lzop: short read
wal_e.main CRITICAL MSG: An unprocessed exception has avoided all error handling
DETAIL: Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/wal_e/cmd.py", line 657, in main
args.prefetch)
File "/usr/lib/python3.5/site-packages/wal_e/operator/backup.py", line 353, in wal_restore
self.gpg_key_id is not None)
File "/usr/lib/python3.5/site-packages/wal_e/worker/worker_util.py", line 58, in do_lzop_get
return blobstore.do_lzop_get(creds, url, path, decrypt, do_retry=do_retry)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 52, in do_lzop_get
raise exc
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/file_util.py", line 64, in write_and_return_error
key.get_contents_to_file(stream)
File "/usr/lib/python3.5/site-packages/wal_e/blobstore/file/calling_format.py", line 53, in get_contents_to_file
with open(self.path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/backups/wal_005/00000001.history.lzo'
STRUCTURED: time=2022-04-13T14:09:55.689548-00 pid=38
Failed to fetch wal 00000001.history: None
Exception on /wal-fetch/00000001.history [GET]
Traceback (most recent call last):
File "/usr/lib/python3.5/site-packages/flask/app.py", line 2292, in wsgi_app
response = self.full_dispatch_request()
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1816, in full_dispatch_request
return self.finalize_request(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1831, in finalize_request
response = self.make_response(rv)
File "/usr/lib/python3.5/site-packages/flask/app.py", line 1957, in make_response
'The view function did not return a valid response. The'
TypeError: The view function did not return a valid response. The function either returned None or ended without a return statement.
172.18.0.4 - - [13/Apr/2022 14:09:55] "GET /wal-fetch/00000001.history HTTP/1.1" 500 -
I've added some additional logs from the wale container that create the error message on running timescaledb-recovered. I'm guessing that there is some issue with the requests timescaledb-recovered is sending because wget works until that continer is started.
Upvotes: 0
Views: 675
Reputation: 671
This is bizarre, but apparently the critical failure and 500 error are intended to lets postgres know no further segments need to be recovered. Incredibly frustrating.
Upvotes: 0