Reputation: 61
Thank you for your help.
I am following the speech commands tutorial for TensorFlow. After downloading the code and the dataset, I ran the program, and after several training steps this error came out:
InvalidArgumentError (see above for traceback): Data too short when trying to read string
[[Node: DecodeWav = DecodeWav[desired_channels=1, desired_samples=16000, _device="/job:localhost/replica:0/task:0/device:CPU:0"](ReadFile)]]
[[Node: DecodeWav/_21 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_4_DecodeWav", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
It seems that something is wrong with the decoding step, but I cannot figure out what. I did not change any code after downloading it from GitHub. Can you help me? Thanks.
Upvotes: 2
Views: 1941
Reputation: 21
In my case there were no empty files. The function at the bottom of this discussion helped me:
https://github.com/mozilla/DeepSpeech/issues/2048
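    import os
    import sys
    import wave

    import pandas

    def compare_header_and_size(wav_filename):
        # Size implied by the WAV header: audio data plus a 44-byte header.
        with wave.open(wav_filename, 'r') as fin:
            header_fsize = (fin.getnframes() * fin.getnchannels() * fin.getsampwidth()) + 44
        # Actual size of the file on disk.
        file_fsize = os.path.getsize(wav_filename)
        # True means the two sizes disagree, i.e. the file is suspect.
        return header_fsize != file_fsize

    # The CSV passed as the first argument must have a 'wav_filename' column.
    df = pandas.read_csv(sys.argv[1])
    invalid = df.apply(lambda x: compare_header_and_size(x['wav_filename']), axis=1)
    print('The following files are corrupted:')
    print(df[invalid].values)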
I found that my audio files had a different length depending on which of the two measurements in the function was used: the size computed from the header versus the actual size on disk.
The reason was that I had added metadata to the WAV files when saving them in Adobe Audition. That was my mistake.
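To see whether extra metadata is what makes the two sizes disagree, it can help to list the top-level chunks of a WAV file. This is only a minimal sketch using the standard library; the helper name `list_riff_chunks` is my own, and it assumes a plain little-endian RIFF/WAVE file. A file with only `fmt ` and `data` chunks should match the 44-byte header assumption in the function above, while additional chunks (for example `LIST`) would explain a mismatch.

```python
import struct

def list_riff_chunks(path):
    """Return [(chunk_id, size)] for the top-level chunks of a RIFF/WAVE file.

    Hypothetical helper for illustration; assumes a little-endian RIFF file.
    """
    chunks = []
    with open(path, "rb") as f:
        head = f.read(12)
        if len(head) < 12:
            raise ValueError("file too short to be a WAV file: %s" % path)
        riff, _total_size, wave_id = struct.unpack("<4sI4s", head)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file: %s" % path)
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                break  # end of file (or a truncated chunk header)
            chunk_id, size = struct.unpack("<4sI", chunk_header)
            chunks.append((chunk_id.decode("ascii", "replace"), size))
            # Skip the chunk body; chunks are padded to an even length.
            f.seek(size + (size & 1), 1)
    return chunks
```

Usage: `print(list_riff_chunks('some_file.wav'))`, where `some_file.wav` is a placeholder path. Anything listed besides `fmt ` and `data` is extra data that the header-based size estimate does not account for.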
Upvotes: 1
Reputation: 61
Problem solved. One of the audio files in the dataset was empty (0 KB). The program picks training samples at random, and whenever it happened to pick this empty file, it raised the error shown in the question.
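For anyone who wants to check their copy of the dataset before training, here is a rough sketch that walks a directory and reports WAV files that are empty or contain no audio data. The function name `find_suspect_wavs` and the 44-byte threshold (the size of a canonical PCM WAV header) are my own assumptions and not part of the tutorial code.

```python
import os

def find_suspect_wavs(root_dir, header_bytes=44):
    """Return sorted (path, size) pairs for .wav files with no audio data.

    Hypothetical helper: a file no larger than a canonical 44-byte WAV
    header (including a 0-byte file) cannot contain any samples.
    """
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.lower().endswith(".wav"):
                continue
            path = os.path.join(dirpath, name)
            size = os.path.getsize(path)
            if size <= header_bytes:
                suspects.append((path, size))
    return sorted(suspects)
```

Usage: `for path, size in find_suspect_wavs('path/to/dataset'): print(path, size)`, with `path/to/dataset` replaced by wherever the data was downloaded. Deleting or re-downloading the reported files should avoid the error.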
Upvotes: 4