Reputation: 25367
I am trying to read wav files from a tarfile which is located in a bucket. Since there are a lot of files I do not want to extract those files first.
Instead, I would like to read the data from the tarfile and stream it to wavfile.read (from scipy.io).
with tf.gfile.Open(chunk_fp, mode='rb') as f:
    with tarfile.open(fileobj=f, mode='r|*') as tar:
        for member in ds_text.index.values:
            bytes = BytesIO(tar.extractfile(member))  # Obviously not working
            rate, wav_data = wavfile.read(bytes)
            # Do stuff with data ..
However, I am not able to get my hands on a stream for wavfile.read to work on.
Trying different things gets me different errors:
tar.extractfile(member).seek(0)
{AttributeError}'_Stream' object has no attribute 'seekable'
tar.extractfile(member).raw.read()
{StreamError}seeking backwards is not allowed
and so on.
Any ideas how I can achieve this?
Upvotes: 2
Views: 1471
Reputation: 25367
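It turns out that I just opened the file in the wrong mode. The r|* mode treats the archive as a forward-only stream of tar blocks and does not allow random access, whereas r:* opens it with seeking enabled. Using r:* instead of r|* works: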
with tarfile.open(fileobj=f, mode='r:*') as tar:
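For reference, here is a minimal, self-contained sketch of the same idea. It makes several assumptions that are not in the original post: an in-memory archive stands in for the bucket file opened with tf.gfile.Open, the standard-library wave module stands in for scipy.io.wavfile.read so that nothing third-party is needed, and the helpers make_wav_bytes, make_tar_bytes and read_wavs are hypothetical names used only for illustration.

```python
import io
import tarfile
import wave


def make_wav_bytes(n_frames=8, rate=16000):
    """Build a tiny mono 16-bit WAV file in memory (test data only)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * n_frames)
    return buf.getvalue()


def make_tar_bytes(files):
    """Pack a {name: bytes} mapping into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_wavs(fileobj, names):
    """Yield (name, sample_rate, raw_frames) for each requested member."""
    # 'r:*' requires a seekable fileobj, but members can be read in any order.
    with tarfile.open(fileobj=fileobj, mode="r:*") as tar:
        for name in names:
            member = tar.extractfile(name)
            # Copy the member into BytesIO so the WAV reader gets a
            # plain seekable buffer, independent of the archive.
            data = io.BytesIO(member.read())
            with wave.open(data, "rb") as w:
                yield name, w.getframerate(), w.readframes(w.getnframes())


# Stand-in for the file object returned by tf.gfile.Open(chunk_fp, mode='rb').
archive = io.BytesIO(
    make_tar_bytes({"a.wav": make_wav_bytes(), "b.wav": make_wav_bytes(4, 8000)})
)

# Members are requested out of archive order, which needs random access.
results = list(read_wavs(archive, ["b.wav", "a.wav"]))
```

With scipy installed, the BytesIO buffer built inside read_wavs could presumably be passed to wavfile.read in place of wave.open, since it is an ordinary seekable file-like object.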
Upvotes: 2