Read npy file directly from S3 StreamingBody

Question

I have many .npy (NumPy) and .json files stored in S3, and I need to download and load them.

For the .json files, I download each file and load it into a dictionary using the following code:

import json
import boto3

s3 = boto3.resource('s3')
obj = s3.Object(bucket, key)  # bucket and key are defined elsewhere
response = obj.get()
body = response['Body'].read().decode('utf-8')
data = json.loads(body)

That way I avoid writing the .json file to disk and reading it back.

Now I want to do the same for the .npy files. What would be the equivalent?

Currently I download the file, save it to a temporary file on disk, and load it with np.load, but I would like to avoid that if possible.

mtngld · Accepted Answer

Following https://stackoverflow.com/a/28196540/3509999:

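import io
import boto3
import numpy as np

s3 = boto3.resource("s3")
obj = s3.Object(bucket, key)
with io.BytesIO(obj.get()["Body"].read()) as f:
    # A new BytesIO already starts at position 0; the seek is a harmless safeguard
    f.seek(0)
    arr = np.load(f)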
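
The idea can be tried without S3 access. As far as I know, np.load wants a seekable file-like object, which the raw streaming body typically is not, so the bytes are read in full and wrapped in io.BytesIO. Below is a minimal sketch of that round trip: the helper name load_npy_from_bytes is hypothetical (it is not part of boto3 or NumPy), and the bytes that would normally come from obj.get()["Body"].read() are produced here by np.save into an in-memory buffer.

```python
import io

import numpy as np


def load_npy_from_bytes(raw: bytes) -> np.ndarray:
    """Load a .npy payload held in memory (hypothetical helper for illustration)."""
    # Wrap the raw bytes in BytesIO so np.load gets a seekable file-like object.
    with io.BytesIO(raw) as buffer:
        return np.load(buffer)


# Stand-in for obj.get()["Body"].read(): serialize an array to bytes in memory.
original = np.arange(6, dtype=np.float32).reshape(2, 3)
with io.BytesIO() as out:
    np.save(out, original)
    raw_bytes = out.getvalue()

restored = load_npy_from_bytes(raw_bytes)
```

With real data, raw_bytes would simply be replaced by the result of reading the S3 body. Note that this holds the whole file in memory, which is fine for moderately sized arrays but may not suit very large ones.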
