rotsner
rotsner

Reputation: 692

Glue Boto Client -- NoCredentialsError

I've been running my Glue Jobs on a schedule for a few months. Last night my Glue Job failed due to botocore.exceptions.NoCredentialsError: Unable to locate credentials after calling bucket.objects.filter(Prefix=productionDirectory):

I am under the impression this is a result of not having defined a credentials file, but AWS Glue has always pulled credentials without issue. I just re-ran my job and everything worked perfectly. For reference, I define my Glue Client via: glue = boto3.client('glue'). Has anyone ever experienced this before? Is this just an edge-case?

Full Logs:

Traceback (most recent call last):
  File "/tmp/data-deployment", line 67, in <module>
    for obj in bucket.objects.filter(Prefix=productionDirectory):
  File "/home/spark/.local/lib/python3.7/site-packages/boto3/resources/collection.py", line 83, in __iter__
    for page in self.pages():
  File "/home/spark/.local/lib/python3.7/site-packages/boto3/resources/collection.py", line 166, in pages
    for page in pages:
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/paginate.py", line 255, in __iter__
    response = self._make_request(current_kwargs)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/paginate.py", line 332, in _make_request
    return self._method(**current_kwargs)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/client.py", line 613, in _make_api_call
    operation_model, request_dict, request_context)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/client.py", line 632, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/signers.py", line 160, in sign
    auth.add_auth(request)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials

Edit/Update: This is a known bug. I've posted the mitigation strategy provided from AWS as an answer below.

Upvotes: 2

Views: 590

Answers (2)

rotsner
rotsner

Reputation: 692

Update: I reached out to AWS via Support and they responded. Apparently this is a known bug and issue. While they do not have a solution or ETA for solution, they do have a way to mitigate the issue. Information below:

Thank you for reporting your issue to us and product team is aware of this intermittent issue. 
They are working on resolution however, I do not have an ETA. 
To mitigate this issue, increase the timeout / attempts to meta service request in your code:

####START######

import os

####Increase meta service timeout and attempt########

os.environ['AWS_METADATA_SERVICE_NUM_ATTEMPTS'] ="5"
os.environ['AWS_METADATA_SERVICE_TIMEOUT'] ="30"

#####################END#################

Upvotes: 4

amsh
amsh

Reputation: 3387

I faced a similar issue with Glue, but not exactly the same.

We used external tables with SparkSQL and S3, and sometimes an Exception was raised out of nowhere, i.e. Table not found. The issue was never reproduced on testing and had least frequency. Since our jobs ran perfectly fine on retries, we enabled the retry mechanism to solve it.

It has something to do with the internal workings of Glue and its serverless environment.

Upvotes: 2

Related Questions