Lei Chen

Reputation: 142

Google Cloud Storage batch move file failure. "Connection reset by peer"

I suspect the code below ran out of connection capacity or hit a similar limit. Is there an interface I can use to send batch requests? Or should I sleep a few ms between calls?

import typing

from google.cloud.storage import Blob, Bucket


def archive_pending_blobs(bucket: Bucket, blobs: typing.List[Blob], pending_prefix: str,
                          loaded_prefix: str) -> None:
    """Archive pending blobs to loaded prefix."""
    try:
        # Copy each blob to its new name under loaded_prefix, one request per blob.
        for b in blobs:
            bucket.copy_blob(b, bucket, b.name.replace(pending_prefix, loaded_prefix))
        # Remove the originals only after every copy has succeeded.
        bucket.delete_blobs(blobs)
    except Exception as e:
        print('gcs archiving error for path: {} err: {}'.format(pending_prefix, e))
        raise e

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
    six.raise_from(e, None)
  File "", line 2, in raise_from
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1321, in getresponse
    response.begin()
  File "/opt/python3.7/lib/python3.7/http/client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "/opt/python3.7/lib/python3.7/http/client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/opt/python3.7/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/opt/python3.7/lib/python3.7/ssl.py", line 1049, in recv_into
    return self.read(nbytes, buffer)
  File "/opt/python3.7/lib/python3.7/ssl.py", line 908, in read
    return self._sslobj.read(len, buffer)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/env/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 367, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/env/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 384, in _make_request
    six.raise_from(e, None)
  File "", line 2, in raise_from
  File "/env/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/opt/python3.7/lib/python3.7/http/client.py", line 1321, in getresponse
    response.begin()
  File "/opt/python3.7/lib/python3.7/http/client.py", line 296, in begin
    version, status, reason = self._read_status()
  File "/opt/python3.7/lib/python3.7/http/client.py", line 257, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/opt/python3.7/lib/python3.7/socket.py", line 589, in readinto
    return self._sock.recv_into(b)
  File "/opt/python3.7/lib/python3.7/ssl.py", line 1049, in recv_into
    return self.read(nbytes, buffer)
  File "/opt/python3.7/lib/python3.7/ssl.py", line 908, in read
    return self._sslobj.read(len, buffer)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/user_code/main.py", line 230, in bq_merge
    archive_pending_blobs(bucket, blobs[min_idx:max_idx], pending_prefix, loaded_prefix)
  File "/user_code/main.py", line 44, in archive_pending_blobs
    raise e
  File "/user_code/main.py", line 40, in archive_pending_blobs
    bucket.copy_blob(b, bucket, b.name.replace(pending_prefix, loaded_prefix))
  File "/env/local/lib/python3.7/site-packages/google/cloud/storage/bucket.py", line 711, in copy_blob
    _target_object=new_blob,
  File "/env/local/lib/python3.7/site-packages/google/cloud/_http.py", line 290, in api_request
    headers=headers, target_object=_target_object)
  File "/env/local/lib/python3.7/site-packages/google/cloud/_http.py", line 183, in _make_request
    return self._do_request(method, url, headers, data, target_object)
  File "/env/local/lib/python3.7/site-packages/google/cloud/_http.py", line 212, in _do_request
    url=url, method=method, headers=headers, data=data)
  File "/env/local/lib/python3.7/site-packages/google/auth/transport/requests.py", line 201, in request
    method, url, data=data, headers=request_headers, **kwargs)
  File "/env/local/lib/python3.7/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/env/local/lib/python3.7/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/env/local/lib/python3.7/site-packages/requests/adapters.py", line 495, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions_v1beta2/worker.py", li

Upvotes: 1

Views: 4749

Answers (1)

Guillermo Cacheda

Reputation: 2232

Per this StackOverflow answer about "Connection reset by peer", this appears to be a fatal error in which the remote server sends an RST packet to drop the connection immediately.

This other SO answer tackles how to solve it. The solution given there is to use time.sleep but, as we discussed in the comments, that didn't work in your case. That's why I'm suggesting a different approach, truncated exponential backoff:

Truncated exponential backoff is a standard error handling strategy for network applications in which a client periodically retries a failed request with increasing delays between requests. [...]

Accessing Cloud Storage through a client library. Note that some client libraries, such as the Cloud Storage Client Library for Node.js, have built-in exponential backoff.

There's no built-in exponential backoff for Python, but there's an example of how to handle retries in Python with this method.
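In case the linked example is not at hand, here is a minimal sketch of the idea, not the code from the documentation. The helper name call_with_backoff and all of its parameters (retry_on, max_attempts, base_delay, max_delay, sleep) are made up for illustration. It retries a callable, waits roughly 1 s, 2 s, 4 s, ... plus a little random jitter, and caps each wait at max_delay:

```python
import random
import time


def call_with_backoff(func, *args, retry_on=(ConnectionError,), max_attempts=5,
                      base_delay=1.0, max_delay=32.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying with truncated exponential backoff.

    Hypothetical helper: the name and every keyword parameter are assumptions
    made for this sketch, not part of any Google client library.
    """
    for attempt in range(max_attempts):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay * 2^attempt plus up to 1 s of jitter, capped at max_delay.
            delay = min(base_delay * 2 ** attempt + random.uniform(0, 1), max_delay)
            sleep(delay)


# Possible use inside the loop from the question (assumes `requests` is importable,
# since the traceback ends in requests.exceptions.ConnectionError, which is not a
# subclass of the built-in ConnectionError):
#
#     new_name = b.name.replace(pending_prefix, loaded_prefix)
#     call_with_backoff(bucket.copy_blob, b, bucket, new_name,
#                       retry_on=(requests.exceptions.ConnectionError,))
```

Wrapping each copy_blob call separately means a reset only repeats the one copy that failed rather than the whole batch. Only retry operations that are safe to repeat; copying a blob to the same destination name should be, but check that for your own case.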

Upvotes: 2
