user3476582

Reputation: 85

Read a list of table names from a named folder in an Amazon S3 bucket using boto3

I have a use case where I need to read the table names under a folder in an Amazon S3 bucket, given a path. For example, say the path is s3://mybucket/aws glue service/raw/source_data/

Inside source_data there is a folder named Tables that lists the table names, e.g.:

Tables:

So basically, I want to write a function that returns ["users","customers","Admin"].

Here's what I have so far:

def read_tables(path):
    tables = []
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(path)
    for obj in bucket.objects.filter(Prefix='Tables/'):
        tables.append(obj)
    return tables

Upvotes: 2

Views: 1119

Answers (1)

Oluwafemi Sule

Reputation: 38982

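The table name is the last segment of each object key, so you can split the S3 URI into a bucket name and a key prefix, list the objects under the Tables/ prefix, and take the final path segment of each key: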

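import boto3

def read_tables(s3_uri):
    tables = []
    s3 = boto3.resource('s3')
    # Drop the leading 's3://' and split into bucket name and key prefix.
    bucketname, key = s3_uri[len('s3://'):].split('/', 1)
    bucket = s3.Bucket(bucketname)
    # Strip any trailing slash so the prefix does not contain '//'.
    prefix = f"{key.rstrip('/')}/Tables/"

    for obj in bucket.objects.filter(Prefix=prefix):
        # The table name is the last segment of the object key.
        tablename = obj.key.split('/').pop()
        # Skip the empty name produced by a folder placeholder key ending in '/'.
        if tablename:
            tables.append(tablename)
    return tables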
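
If it helps to check the key handling without touching S3, here is a minimal sketch of just the string logic. The helper names split_s3_uri and table_names are hypothetical (they are not part of boto3), and the sample keys, including the part-0000.parquet file names, are made up for illustration. It also assumes a table may be stored either as a single object or as a sub-folder under Tables/, so it keeps the first path segment after the prefix rather than the last one.

```python
def split_s3_uri(s3_uri):
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix')."""
    scheme = "s3://"
    if not s3_uri.startswith(scheme):
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the key prefix.
    bucket, _, key = s3_uri[len(scheme):].partition("/")
    return bucket, key.strip("/")


def table_names(keys, prefix):
    """Return the distinct first path segment that follows `prefix` in each key."""
    names = []
    for key in keys:
        if not key.startswith(prefix):
            continue
        # 'prefix/customers/part-0000.parquet' -> 'customers'
        name = key[len(prefix):].split("/", 1)[0]
        # Skip the folder placeholder (empty name) and repeated table names.
        if name and name not in names:
            names.append(name)
    return names


# Hypothetical keys, standing in for obj.key values returned by boto3.
bucket, key = split_s3_uri("s3://mybucket/aws glue service/raw/source_data/")
prefix = f"{key}/Tables/"
sample_keys = [
    prefix,
    prefix + "users",
    prefix + "customers/part-0000.parquet",
    prefix + "customers/part-0001.parquet",
    prefix + "Admin",
]
print(table_names(sample_keys, prefix))
```

With real data you would pass (obj.key for obj in bucket.objects.filter(Prefix=prefix)) as the keys argument.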

Upvotes: 2
