Kumar

Reputation: 25

How to speed up S3 bucket object filtering

I am using the following Python code to filter objects by prefix and pattern, but because of the number of files in the bucket, it takes too long to return results and the web server times out. Is there any way I can speed up the process?

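import re
import boto3

s3 = boto3.resource("s3")
my_list = []
my_bucket = s3.Bucket(bucket)
# Prefix is applied server-side; the regex runs client-side on every listed key
for obj in my_bucket.objects.filter(Prefix=path):
    if re.search(SourceFilePattern, obj.key):
        my_list.append(obj.key)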

Upvotes: 0

Views: 1267

Answers (1)

alexis-donoghue

Reputation: 3387

You can use S3 Inventory to periodically generate a static listing file (CSV, for example) with object metadata, and then run your code against that file. It will be faster than waiting on network I/O for each list call (the listing API returns at most 1,000 keys per request).
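As a rough sketch of running the same prefix-and-regex filter against an inventory file instead of the live listing: the function below reads one gzipped inventory CSV from memory. The function name and the `key_column` default are hypothetical; the real column order comes from the `fileSchema` field in the inventory's `manifest.json`, and keys in CSV output are URL-encoded, so the sketch decodes them (using `unquote` here is an assumption about how they should be decoded).

```python
import csv
import gzip
import io
import re
from urllib.parse import unquote


def filter_inventory_keys(csv_gz_bytes, prefix, pattern, key_column=1):
    """Return keys from one gzipped inventory CSV that match prefix and pattern.

    key_column=1 assumes the schema starts with "Bucket, Key"; check the
    fileSchema in manifest.json for the actual order.
    """
    regex = re.compile(pattern)  # compile once, reuse for every row
    matches = []
    with gzip.open(io.BytesIO(csv_gz_bytes), mode="rt",
                   encoding="utf-8", newline="") as handle:
        for row in csv.reader(handle):  # inventory CSV files have no header row
            key = unquote(row[key_column])  # keys are URL-encoded in CSV output
            if key.startswith(prefix) and regex.search(key):
                matches.append(key)
    return matches
```

The inventory is a periodic snapshot, so objects uploaded since the last report will not appear in it.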

If you want to process it in near real time and serve responses from the web server, I'd rather populate a DynamoDB table beforehand and query it from the web server process. You can react to S3 Put events and trigger a Lambda function that writes the object's data to DynamoDB if the file matches the criteria.
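A minimal sketch of such a Lambda handler, under several assumptions: the table name, the pattern, and the item attributes (`bucket` as partition key, `key` as sort key) are all hypothetical, and the optional `table` argument exists only so the handler can be exercised without AWS.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical values; substitute your own pattern and table name.
SOURCE_FILE_PATTERN = re.compile(r"\.csv$")
TABLE_NAME = "matching-s3-keys"


def matching_items(event, pattern=SOURCE_FILE_PATTERN):
    """Extract (bucket, key) items from an S3 event whose keys match pattern."""
    items = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if pattern.search(key):
            items.append({"bucket": bucket, "key": key})
    return items


def lambda_handler(event, context, table=None):
    """Store matching keys in DynamoDB; `table` can be injected for testing."""
    if table is None:
        import boto3  # imported lazily so the module loads without boto3
        table = boto3.resource("dynamodb").Table(TABLE_NAME)
    items = matching_items(event)
    for item in items:
        table.put_item(Item=item)
    return {"stored": len(items)}
```

With that assumed key schema, the web server could fetch everything under a path with a single DynamoDB query, for example `Key("bucket").eq(bucket) & Key("key").begins_with(path)` from `boto3.dynamodb.conditions`, rather than listing the bucket on each request.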

Upvotes: 1
