boto3 S3 Object Parsing

Question

I'm trying to write a Python script for processing audio data stored on S3.

I fetch an S3 object with the following function:

def grabAudio(filename, directory):
    obj = s3client.get_object(Bucket=bucketname, Key=directory + '/' + filename)
    return obj['Body'].read()

Accessing the data using

print(obj['Body'].read())

yields the correct audio data, so it's reading from the bucket just fine.

When I try to then use this data in my audio processing library (pydub), it fails:

audio = AudioSegment.from_wav(grabAudio(filename, bucketname))

Traceback (most recent call last):
  File "split_audio.py", line 38, in <module>
    audio = AudioSegment.from_wav(grabAudio(filename, bucketname))
  File "C:\Users\jmk_m\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pydub\audio_segment.py", line 544, in from_wav
    return cls.from_file(file, 'wav', parameters)
  File "C:\Users\jmk_m\AppData\Local\Programs\Python\Python36-32\lib\site-packages\pydub\audio_segment.py", line 456, in from_file
    file.seek(0)
AttributeError: 'bytes' object has no attribute 'seek'

What is the format of the object coming back from S3? A bytes object, I presume? If so, is there a way to parse it as .wav data without saving it to disk? I'd like to avoid writing to disk.

I'm also open to alternative audio processing libraries.

Answers (1)
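
The traceback suggests that from_wav wants a file-like object (it calls file.seek(0)), while obj['Body'].read() returns plain bytes. One common way to bridge that without touching disk is to wrap the bytes in io.BytesIO. Below is a minimal sketch of the idea using only the standard library, so it runs without S3 or pydub. The helpers make_wav_bytes and wav_info are hypothetical names made up for this illustration, and the in-memory WAV merely stands in for whatever grabAudio returns.

```python
import io
import wave


def make_wav_bytes(n_frames=8000, framerate=8000):
    """Build a short silent mono WAV in memory.

    Hypothetical stand-in for the bytes returned by obj['Body'].read().
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)       # mono
        w.setsampwidth(2)       # 16-bit samples
        w.setframerate(framerate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def wav_info(raw):
    """Parse WAV bytes entirely in memory and return basic header fields."""
    stream = io.BytesIO(raw)    # file-like wrapper: supports read() and seek()
    with wave.open(stream, "rb") as w:
        return w.getnchannels(), w.getframerate(), w.getnframes()


raw = make_wav_bytes()
print(wav_info(raw))

# With pydub (not imported here), the same wrapper would be passed along
# the lines of:
#   audio = AudioSegment.from_wav(io.BytesIO(grabAudio(filename, directory)))
```

The key point is that bytes has no seek method but io.BytesIO does, so a library that expects an open file can read from it as if it were one. Note also that the call in the question passes bucketname where grabAudio expects directory, which may be worth double-checking.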
