5pence

Reputation: 175

AWS CLI command for querying S3 bucket

I have a bucket, which I will call some-images here. I am trying to get my command to return all objects whose key starts with a given number (using starts_with). Here is the code I am using:

cmd = "aws s3api list-objects --bucket some-images --query \"Contents[?starts_with(Key, '10000'\""
push = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
print(push.returncode) 

I am getting 'None' back, and I know that there are objects there.

What I am trying to do eventually is return a JSON array of objects for the matched files. The code below produces the desired output, but it is too slow:

import boto3
import json

urls=[]
s3=boto3.resource('s3')
bucket=s3.Bucket('some-images')
for obj in bucket.objects.all():
    if obj.key.startswith('100001'):
        urls.append(obj.key)
        json_ready=[{"url":u} for u in zip(urls)]
print(json.dumps(json_ready))

Output of the Python code: [{"url": ["100001.JPG"]}]

Thank you in advance

Upvotes: 1

Views: 7666

Answers (2)

Julian Tellez

Reputation: 852

I would recommend S3 Select. It is very capable and potentially cheaper than running multiple requests to retrieve data.

It essentially lets you query S3 as though it were a database (S3 is sort of a database, but that's a different discussion).

Someone has written a gist that illustrates how to use s3 select here.

Upvotes: 1

raevilman

Reputation: 3259

See if the following AWS CLI command works for you:

aws s3 ls s3://bucket_name/key 


Read more about 'aws s3 ls' here
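The same prefix idea should carry over to the boto3 code in the question. Below is a minimal sketch, assuming the bucket name some-images and the prefix 100001 from the question; the helper names keys_with_prefix, to_url_records and main are made up for illustration. It asks S3 to filter by prefix through objects.filter(Prefix=...) instead of iterating over bucket.objects.all() and testing each key in Python, which is likely where the slowness comes from. The output shape copies the one shown in the question, where each key is wrapped in a one-element list.

```python
import json


def keys_with_prefix(bucket, prefix):
    """Return the keys in `bucket` that start with `prefix`.

    `bucket` is expected to be a boto3 `s3.Bucket` resource. The prefix is
    passed to S3, so only matching keys are listed and returned.
    """
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]


def to_url_records(keys):
    """Shape keys like the question's output: [{"url": ["100001.JPG"]}, ...].

    Hypothetical helper; the one-element list mirrors the question's output.
    """
    return [{"url": [key]} for key in keys]


def main():
    # boto3 is imported here so the helpers above can be used without it.
    import boto3

    # Bucket name and prefix are the placeholder values from the question.
    bucket = boto3.resource("s3").Bucket("some-images")
    print(json.dumps(to_url_records(keys_with_prefix(bucket, "100001"))))
```

Calling main() needs AWS credentials configured. If a flat {"url": "100001.JPG"} is what you actually want, drop the inner list in to_url_records.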

[Update:1]

To use the low-level s3api, you can use the following command:

aws s3api list-objects-v2 --bucket my_images_bucket --max-items 10 --prefix Ariba_ --output json

Read more about 'aws s3api list-objects-v2' here
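If you want to keep calling the CLI from Python as in the question, here is a rough sketch of how that could look. Two things to note about the original snippet: Popen.returncode stays None until the process has been waited on (wait(), poll() or communicate()), so printing it right after Popen always gives None, and the listing itself goes to stdout, not to returncode. The function names build_command, parse_keys and list_keys are invented for this example, and the --query expression Contents[].Key is my assumption about what you want back (just the keys).

```python
import json
import subprocess


def build_command(bucket, prefix):
    """Build the argument list for `aws s3api list-objects-v2`."""
    return [
        "aws", "s3api", "list-objects-v2",
        "--bucket", bucket,
        "--prefix", prefix,            # server-side filter on the key prefix
        "--query", "Contents[].Key",   # assumed query: return only the keys
        "--output", "json",
    ]


def parse_keys(stdout_text):
    """Turn the CLI's JSON output into a list of keys.

    When no key matches, the response has no Contents and the query
    evaluates to null, so empty or null output is treated as no keys.
    """
    if not stdout_text.strip():
        return []
    return json.loads(stdout_text) or []


def list_keys(bucket, prefix):
    """Run the CLI, wait for it to finish, and return the matching keys."""
    # run() waits for the process to exit, so returncode is populated;
    # check=True raises CalledProcessError on a non-zero exit status.
    result = subprocess.run(
        build_command(bucket, prefix),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_keys(result.stdout)
```

Something like list_keys("some-images", "100001") would then return a plain Python list of keys. Passing the arguments as a list also avoids shell=True and the nested quoting, which is where the query in the question lost its closing `)]`. I left out --max-items here; add it back to build_command if you only want the first few results.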

Upvotes: 3
