Reputation: 6154
I would like to know if a key exists in boto3. I can loop over the bucket contents and check whether each key matches.
But that seems long-winded and overkill. The official Boto3 docs do not explicitly state how to do this.
Maybe I am missing the obvious. Can anybody point me to how I can achieve this?
Upvotes: 372
Views: 413449
Reputation: 639
FWIW, here are the very simple functions that I am using:
import os

import boto3


def get_resource(config: dict = None):
    """Loads the s3 resource.

    Expects AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY to be in the environment
    or in a config dictionary.
    Looks in the environment first."""
    config = config or {}  # avoid a mutable default argument
    s3 = boto3.resource(
        's3',
        aws_access_key_id=os.environ.get(
            "AWS_ACCESS_KEY_ID", config.get("AWS_ACCESS_KEY_ID")),
        aws_secret_access_key=os.environ.get(
            "AWS_SECRET_ACCESS_KEY", config.get("AWS_SECRET_ACCESS_KEY")))
    return s3


def get_bucket(s3, s3_uri: str):
    """Get the bucket from the resource.

    A thin wrapper, use with caution.
    Example usage:
    >> bucket = get_bucket(get_resource(), s3_uri_prod)"""
    return s3.Bucket(s3_uri)


def isfile_s3(bucket, key: str) -> bool:
    """Returns T/F whether the file exists."""
    # Prefix also matches longer keys (e.g. "a.txt.bak"), so compare each key exactly
    return any(obj.key == key for obj in bucket.objects.filter(Prefix=key))


def isdir_s3(bucket, key: str) -> bool:
    """Returns T/F whether the directory exists."""
    objs = list(bucket.objects.filter(Prefix=key))
    return len(objs) > 0
Upvotes: 8
Reputation: 1073
S3 now supports conditional writes. With conditional writes you can add headers to your write requests to attach preconditions to the S3 operation, which can prevent overwriting existing data: the write succeeds only if no object with the same key name already exists in your bucket.
AWS CLI example:
aws s3api put-object --bucket amzn-s3-demo-bucket --key dir-1/my_images.tar.bz2 --body my_images.tar.bz2 --if-none-match "*"
It can also be used with Boto3, through the IfNoneMatch parameter of put_object (available in recent boto3 releases):
import boto3

s3_client = boto3.client('s3')
bucket_name = 'amzn-s3-demo-bucket'
object_key = 'dir-1/my_images.tar.bz2'
file_path = 'my_images.tar.bz2'

# Upload the file only if no object with this key exists yet.
# If the key already exists, S3 rejects the request with 412 Precondition Failed.
with open(file_path, 'rb') as body:
    s3_client.put_object(
        Bucket=bucket_name,
        Key=object_key,
        Body=body,
        ContentType='text/plain',
        CacheControl='max-age=3600',
        ContentDisposition='attachment',
        IfNoneMatch='*',  # the precondition is a request parameter, not object Metadata
    )
More - https://docs.aws.amazon.com/AmazonS3/latest/userguide/conditional-requests.html
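A minimal sketch of wrapping the conditional write in a helper, assuming a boto3 version whose put_object accepts IfNoneMatch. The name put_if_absent is hypothetical, and the error is inspected by duck typing only to keep the sketch free of imports; in real code you would normally catch botocore.exceptions.ClientError instead of Exception.

```python
def put_if_absent(s3_client, bucket, key, body):
    """Write the object only if `key` is not already present.

    Returns True if the object was written, False if the key already existed.
    `s3_client` is assumed to behave like boto3.client('s3').
    """
    try:
        s3_client.put_object(Bucket=bucket, Key=key, Body=body, IfNoneMatch="*")
    except Exception as exc:
        # botocore's ClientError exposes the parsed error in `exc.response`.
        error = getattr(exc, "response", {}).get("Error", {})
        if error.get("Code") in ("PreconditionFailed", "412"):
            return False  # precondition failed: the key already exists
        raise  # anything else (permissions, missing bucket, ...) is a real error
    return True
```

Called as put_if_absent(boto3.client('s3'), bucket_name, object_key, data), this avoids a separate existence check followed by a write, so there is no window in which another writer can create the key in between.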
Upvotes: 0
Reputation: 8350
I'm not a big fan of using exceptions for control flow. This is an alternative approach that works in boto3:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
key = 'dootdoot.jpg'
objs = list(bucket.objects.filter(Prefix=key))
keys = {o.key for o in objs}
if key in keys:
    print("Exists!")
else:
    print("Doesn't exist")
Upvotes: 203
Reputation: 4213
The easiest way I found (and probably the most efficient) is this:
import boto3
import botocore

s3 = boto3.client('s3')
try:
    s3.head_object(Bucket='bucket_name', Key='file_path')
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        # The key does not exist.
        ...
    elif e.response['Error']['Code'] == "403":
        # Unauthorized, including invalid bucket
        ...
    else:
        # Something else has gone wrong.
        raise
Upvotes: 272
Reputation: 804
It's 2023 and none of the above worked for me. Here is the version that did:
import boto3
import botocore

s3 = boto3.client('s3')
object_name = 'YOUR_OBJECT_KEY'
try:
    s3.head_object(Bucket='YOUR_BUCKET_NAME', Key=object_name)
except botocore.exceptions.ClientError as error:
    if error.response['Error']['Code'] == "404":
        print("Object does not exist!")
    else:
        raise  # e.g. 400 or 403: existence cannot be determined
else:
    print("Object exists!")
If you have permission and the request is otherwise valid, a missing object produces a 404. The request can also fail with 400 or 403, so check for those codes if you want more specific error handling.
Upvotes: 6
Reputation: 334
Assuming you just want to check if a key exists (instead of quietly over-writing it), do this check first. Will also check for errors:
import boto3


def key_exists(mykey, mybucket):
    s3_client = boto3.client('s3')
    try:
        response = s3_client.list_objects_v2(Bucket=mybucket, Prefix=mykey)
        for obj in response['Contents']:
            if mykey == obj['Key']:
                return 'exists'
        return False  # no keys match
    except KeyError:
        return False  # no keys found
    except Exception as e:
        # Handle or log other exceptions such as bucket doesn't exist
        return e


key_check = key_exists('someprefix/myfile-abc123', 'my-bucket-name')
if key_check:
    if key_check == 'exists':
        print("key exists!")
    else:
        # key_check holds the exception returned by key_exists
        print(f"S3 ERROR: {key_check}")
else:
    print("safe to put new bucket object")
    # try:
    #     resp = s3_client.put_object(Body="Your string or file-like object",
    #                                 Bucket=mybucket, Key=mykey)
    # ...check resp success and ClientError exception for errors...
Upvotes: 26
Reputation: 305
You can use awswrangler to do it in 1 line.
awswrangler.s3.does_object_exist(path_of_object_to_check)
https://aws-data-wrangler.readthedocs.io/en/stable/stubs/awswrangler.s3.does_object_exist.html
The does_object_exist method uses the head_object method of the S3 client and checks whether a ClientError is raised. If the error code is 404, then False is returned.
Upvotes: 1
Reputation: 13662
Using objects.filter and checking the resulting list is a quick way to check whether a file exists in an S3 bucket. Note that Prefix matches any key that starts with filename, not only an exact match.
This concise one-liner is unintrusive when you have to drop it into an existing project without modifying much of the code:
s3_file_exists = lambda filename: bool(list(bucket.objects.filter(Prefix=filename)))
The above function assumes the bucket variable was already declared.
You can extend the lambda to take the bucket as an additional parameter:
s3_file_exists = lambda filename, bucket: bool(list(bucket.objects.filter(Prefix=filename)))
Upvotes: 11
Reputation: 1707
This can check both a prefix and a key, and fetches at most 1 key.
import boto3


def prefix_exists(bucket, prefix):
    s3_client = boto3.client('s3')
    res = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return 'Contents' in res
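One thing to keep in mind with a prefix check: it is true for any key that merely starts with the given string. A small sketch of an exact-match test on the same kind of response follows. The helper name key_in_listing and the sample response are made up for illustration; only the 'Contents' and 'Key' fields of the list_objects_v2 response are relied on.

```python
def key_in_listing(response, key):
    """Return True only if `key` itself appears in a list_objects_v2 response."""
    return any(obj["Key"] == key for obj in response.get("Contents", []))


# Hand-written response shaped like list_objects_v2 output (illustrative data):
# the bucket holds "report.csv.bak" but not "report.csv".
response = {"Contents": [{"Key": "report.csv.bak"}], "KeyCount": 1}

print("Contents" in response)                  # prefix check: True
print(key_in_listing(response, "report.csv"))  # exact check: False
```

For general purpose buckets, S3 lists keys in lexicographic order, so a key that exactly equals the prefix would come first and MaxKeys=1 should still be enough for the exact check.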
Upvotes: 32
Reputation: 859
It's really simple with the get() method:
import botocore
from boto3.session import Session

session = Session(aws_access_key_id='AWS_ACCESS_KEY',
                  aws_secret_access_key='AWS_SECRET_ACCESS_KEY')
s3 = session.resource('s3')
bucket_s3 = s3.Bucket('bucket_name')


def not_exist(file_key):
    try:
        file_details = bucket_s3.Object(file_key).get()
        # print(file_details)  # This line prints the file details
        return False
    except botocore.exceptions.ClientError as e:
        # Alternatively, check e.response['ResponseMetadata']['HTTPStatusCode'] == 404
        if e.response['Error']['Code'] == "NoSuchKey":
            return True
        # For any other error it's hard to determine whether the object exists,
        # so change this to True, False, or raise depending on your requirements.
        return False


print(not_exist('hello_world.txt'))
Upvotes: 0
Reputation: 2032
Just following the thread, can someone conclude which one is the most efficient way to check if an object exists in S3?
I think head_object might win as it just checks the metadata which is lighter than the actual object itself
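For reference, a compact sketch of the head_object approach as a reusable function. The name object_exists is hypothetical, and the error is inspected by duck typing only so the sketch has no imports; with boto3 you would normally catch botocore.exceptions.ClientError. A HEAD request returns only metadata, never the object body, which is why it stays cheap for large objects.

```python
def object_exists(s3_client, bucket, key):
    """Return True if `key` exists in `bucket`, False on a 404.

    `s3_client` is assumed to behave like boto3.client('s3').
    Errors other than "not found" (for example 403) are re-raised, because
    they do not tell us whether the object exists.
    """
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
    except Exception as exc:
        # botocore's ClientError exposes the parsed error in `exc.response`.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey", "NotFound"):
            return False
        raise
    return True
```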
Upvotes: 1
Reputation: 735
You can use Boto3 for this.
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('my-bucket')
key = 'path/to/object'  # placeholder: the key you want to check
objs = list(bucket.objects.filter(Prefix=key))
if len(objs) > 0:
    print("key exists!!")
else:
    print("key doesn't exist!")
Here key is the path whose existence you want to check.
Upvotes: 15
Reputation: 2173
You can use S3Fs, which is essentially a wrapper around boto3 that exposes typical file-system style operations:
import s3fs

s3 = s3fs.S3FileSystem()
s3.exists('my-bucket/myfile.txt')  # s3fs paths have the form bucket/key
Upvotes: 24
Reputation: 11
Here is a solution that works for me. One caveat is that I know the exact format of the key ahead of time, so I am only listing the single file.
import boto3


# The s3 base class to interact with S3
class S3(object):
    def __init__(self):
        self.s3_client = boto3.client('s3')

    def check_if_object_exists(self, s3_bucket, s3_key):
        response = self.s3_client.list_objects(
            Bucket=s3_bucket,
            Prefix=s3_key
        )
        # 'Contents' is present only when at least one key matches the prefix
        return 'Contents' in response


if __name__ == '__main__':
    bucket = 'my-bucket'  # placeholder
    key = 'path/to/object'  # placeholder
    s3 = S3()
    if s3.check_if_object_exists(bucket, key):
        print("Found S3 object.")
    else:
        print("No object found.")
Upvotes: 1
Reputation: 479
Try this simple approach:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket_name')  # just the bucket name
file_name = 'A/B/filename.txt'  # full file path
obj = list(bucket.objects.filter(Prefix=file_name))
if len(obj) > 0:
    print("Exists")
else:
    print("Not Exists")
Upvotes: 3
Reputation: 2715
I noticed that just to catch the exception with botocore.exceptions.ClientError we need to install botocore, which takes up 36M of disk space. This has a particular impact if we use AWS Lambda functions. If we catch the generic Exception instead, we can skip the extra library!
The code looks like this. Please share your thoughts:
import boto3
import traceback


def download4mS3(s3bucket, s3Path, localPath):
    s3 = boto3.resource('s3')
    print('Looking for the csv data file ending with .csv in bucket: ' + s3bucket + ' path: ' + s3Path)
    if s3Path.endswith('.csv') and s3Path != '':
        try:
            s3.Bucket(s3bucket).download_file(s3Path, localPath)
        except Exception as e:
            print(e)
            print(traceback.format_exc())
            # Not every exception carries a response dict, so look it up defensively
            if getattr(e, 'response', {}).get('Error', {}).get('Code') == "404":
                print("Downloading the file from: [", s3Path, "] failed")
                exit(12)
            else:
                raise
        print("Downloading the file from: [", s3Path, "] succeeded")
    else:
        print("csv file not found in : [", s3Path, "]")
        exit(12)
Upvotes: 1
Reputation: 13046
If you seek a key that is equivalent to a directory then you might want this approach
import boto3

session = boto3.session.Session()
resource = session.resource("s3")
bucket = resource.Bucket('mybucket')
key = 'dir-like-or-file-like-key'
objects = list(bucket.objects.filter(Prefix=key).limit(1))
has_key = len(objects) > 0
This works for a parent key, a key that equates to a file, or a key that does not exist. I tried the favored approach above and it failed on parent keys.
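One caveat when the key is meant to be a directory: a bare prefix such as "data" also matches sibling keys like "database/b.csv". A small sketch of normalizing the prefix first is shown here; the helper name as_dir_prefix and the key names are made up, and plain string matching stands in for what the Prefix filter does on the S3 side.

```python
def as_dir_prefix(key):
    """Return `key` with exactly one trailing '/', or '' for the bucket root."""
    stripped = key.rstrip("/")
    return stripped + "/" if stripped else ""


keys = ["data/a.csv", "database/b.csv"]  # illustrative key names

# A bare prefix matches both keys ...
print([k for k in keys if k.startswith("data")])
# ... while the normalized directory prefix matches only "data/a.csv".
print([k for k in keys if k.startswith(as_dir_prefix("data"))])
```

Passing as_dir_prefix(key) as the Prefix argument would then restrict the check to objects under that directory.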
Upvotes: 2
Reputation: 408
For boto3, ObjectSummary can be used to check if an object exists.
Contains the summary of an object stored in an Amazon S3 bucket. This object doesn't contain the object's full metadata or any of its contents.
import boto3
from botocore.exceptions import ClientError


def path_exists(path, bucket_name):
    """Check to see if an object exists on S3"""
    s3 = boto3.resource('s3')
    try:
        s3.ObjectSummary(bucket_name=bucket_name, key=path).load()
    except ClientError as e:
        if e.response['Error']['Code'] == "404":
            return False
        else:
            raise e
    return True


path_exists('path/to/file.html', 'my-bucket')
Calls s3.Client.head_object to update the attributes of the ObjectSummary resource.
This shows that you can use ObjectSummary instead of Object if you are not planning to use get(). The load() function does not retrieve the object; it only obtains the summary.
Upvotes: 1
Reputation: 707
import boto3
import botocore

client = boto3.client('s3')
s3_key = 'Your file without bucket name e.g. abc/bcd.txt'
bucket = 'your bucket name'
try:
    # head_object raises ClientError for a missing key, so a normal return means it exists
    client.head_object(Bucket=bucket, Key=s3_key)
    print("File exists - s3://%s/%s" % (bucket, s3_key))
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] != "404":
        raise
    print("File does not exist - s3://%s/%s" % (bucket, s3_key))
Upvotes: 8
Reputation: 658
There is a simple way to check whether a file exists in an S3 bucket without using exceptions:
import boto3

# aws_access_key_id and aws_secret_access_key hold your credentials
session = boto3.Session(aws_access_key_id=aws_access_key_id,
                        aws_secret_access_key=aws_secret_access_key)
s3 = session.client('s3')
object_name = 'filename'
bucket = 'bucketname'
obj_status = s3.list_objects(Bucket=bucket, Prefix=object_name)
if obj_status.get('Contents'):
    print("File exists")
else:
    print("File does not exist")
Upvotes: 1
Reputation: 19695
Boto 2's boto.s3.key.Key object used to have an exists method that checked if the key existed on S3 by doing a HEAD request and looking at the result, but it seems that no longer exists. You have to do it yourself:
import boto3
import botocore

s3 = boto3.resource('s3')
try:
    s3.Object('my-bucket', 'dootdoot.jpg').load()
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        # The object does not exist.
        ...
    else:
        # Something else has gone wrong.
        raise
else:
    # The object does exist.
    ...
load() does a HEAD request for a single key, which is fast, even if the object in question is large or you have many objects in your bucket.
Of course, you might be checking if the object exists because you're planning on using it. If that is the case, you can just forget about the load() and do a get() or download_file() directly, then handle the error case there.
Reputation:
import boto3
from botocore.client import Config

S3_REGION = "eu-central-1"
bucket = "mybucket1"
name = "objectname"


def exists(bucket, name):
    client = boto3.client('s3', region_name=S3_REGION, config=Config(signature_version='s3v4'))
    listing = client.list_objects_v2(Bucket=bucket, Prefix=name)
    for obj in listing.get('Contents', []):
        if obj['Key'] == name:
            return True
    return False
Upvotes: 1
Reputation: 14916
If you have fewer than 1000 objects in a directory or bucket (list_objects_v2 returns at most 1000 keys per call), you can fetch their names as a set and then check whether the key is in that set:
files_in_dir = {d['Key'].split('/')[-1] for d in s3_client.list_objects_v2(
    Bucket='mybucket',
    Prefix='my/dir').get('Contents') or []}
This code works even if my/dir does not exist.
http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2
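For prefixes that may hold more than 1000 objects, a hedged sketch using the client's list_objects_v2 paginator is shown below. The function name iter_keys is hypothetical; get_paginator and paginate are the standard boto3 client calls, and s3_client is assumed to be a boto3.client('s3').

```python
def iter_keys(s3_client, bucket, prefix):
    """Yield every key under `prefix`, following pagination page by page."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page with no matches has no 'Contents' entry at all.
        for obj in page.get("Contents", []):
            yield obj["Key"]
```

The set from the answer above could then be built as {k.split('/')[-1] for k in iter_keys(s3_client, 'mybucket', 'my/dir')}, at the cost of one request per 1000 keys.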
Upvotes: 1
Reputation: 14916
This works not only with the client but with a bucket too:
import boto3
import botocore

bucket = boto3.resource('s3', region_name='eu-west-1').Bucket('my-bucket')
try:
    bucket.Object('my-file').get()
except botocore.exceptions.ClientError as ex:
    if ex.response['Error']['Code'] == 'NoSuchKey':
        print('NoSuchKey')
Upvotes: 13
Reputation: 2267
In Boto3, if you're checking for either a folder (prefix) or a file using list_objects, you can use the existence of 'Contents' in the response dict as a check for whether the object exists. It's another way to avoid the try/except catches, as @EvilPuppetMaster suggests:
import boto3

client = boto3.client('s3')
results = client.list_objects(Bucket='my-bucket', Prefix='dootdoot.jpg')
exists = 'Contents' in results
Upvotes: 41
Reputation: 363
Check out
bucket.get_key(
key_name,
headers=None,
version_id=None,
response_headers=None,
validate=True
)
Check to see if a particular key exists within the bucket. This method uses a HEAD request to check for the existence of the key. Returns: An instance of a Key object or None
(from the Boto S3 docs; note that get_key belongs to Boto 2, not boto3)
You can just call bucket.get_key(keyname) and check if the returned object is None.
Upvotes: 0