Reputation: 85
I have a use case where I need to read table names from a folder on Amazon S3, given a path.
For example, take a bucket with the path s3://mybucket/aws glue service/raw/source_data/
Inside source_data there is a folder named Tables that lists the table names, e.g.:
Tables:
So basically, I want to write a function that returns ["users","customers","Admin"].
Here's what I have so far:
def read_tables(path):
    tables = []
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(path)
    for obj in bucket.objects.filter(Prefix='Tables/'):
        tables.append(obj)
    return tables
Upvotes: 2
Views: 1119
Reputation: 38982
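s3.Bucket() expects a bucket name rather than a full path, so the URI has to be split into bucket and key first. The table name is then the last segment of each object key and can be extracted as follows: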
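import boto3

def read_tables(s3_uri):
    tables = []
    s3 = boto3.resource('s3')
    # Drop the leading 's3://' (5 characters), then split bucket name from key
    bucketname, key = s3_uri[5:].split('/', 1)
    bucket = s3.Bucket(bucketname)
    # rstrip avoids a double slash when the URI already ends with '/'
    prefix = f"{key.rstrip('/')}/Tables/"
    for obj in bucket.objects.filter(Prefix=prefix):
        tablename = obj.key.split('/').pop()
        # Skip an empty name, e.g. from a folder placeholder key ending in '/'
        if tablename:
            tables.append(tablename)
    return tables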
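
The string handling can also be separated from the S3 call so it can be checked without AWS credentials. The following is only a sketch under a few assumptions: the helper names `split_s3_uri` and `table_names_from_keys` are hypothetical (they are not part of boto3), and it assumes a table may be stored either as a single object directly under Tables/ or as a "folder" of objects, in which case only the first path segment after the prefix is taken as the table name.

```python
def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    # partition() also copes with a URI that has no key part at all
    bucket, _, key = s3_uri[len("s3://"):].partition("/")
    return bucket, key.strip("/")


def table_names_from_keys(keys, prefix):
    """Return the distinct first path segments under `prefix`, in first-seen order."""
    names = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        # 'prefix/customers/part-0' -> 'customers'; the bare prefix itself -> ''
        first_segment = key[len(prefix):].split("/", 1)[0]
        if first_segment and first_segment not in names:
            names.append(first_segment)
    return names
```

With boto3, the keys would come from something like `(obj.key for obj in bucket.objects.filter(Prefix=prefix))`, where `prefix` is built from the key returned by `split_s3_uri` plus `/Tables/`.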
Upvotes: 2