UrmLmn
UrmLmn

Reputation: 1257

IOError: [Errno 21] Is a directory: '/tmp/speech_dataset/'

I'm following the speech recognition tutorial from TensorFlow(link: https://www.tensorflow.org/versions/master/tutorials/audio_recognition#advanced_training), and when I'm running the following command, which downloads the dataset provided by TensorFlow, it runs perfectly.

python tensorflow/examples/speech_commands/train.py

However, when I'm changing the defaults, so that it points to my dataset, it throws the following error:

Traceback (most recent call last):
  File "/home/users2/lmn/.local/lib/python2.7/site-packages/tensorflow/examples/speech_commands/train.py", line 428, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/home/users2/lmn/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/home/users2/lmn/.local/lib/python2.7/site-packages/tensorflow/examples/speech_commands/train.py", line 106, in main
    FLAGS.testing_percentage, model_settings)
  File "/home/users2/lmn/.local/lib/python2.7/site-packages/tensorflow/examples/speech_commands/input_data.py", line 158, in __init__
    self.maybe_download_and_extract_dataset(data_url, data_dir)
  File "/home/users2/lmn/.local/lib/python2.7/site-packages/tensorflow/examples/speech_commands/input_data.py", line 204, in maybe_download_and_extract_dataset
    tarfile.open(filepath, 'r:gz').extractall(dest_directory)
  File "/usr/lib64/python2.7/tarfile.py", line 1693, in open
    return func(name, filemode, fileobj, **kwargs)
  File "/usr/lib64/python2.7/tarfile.py", line 1740, in gzopen
    fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
  File "/usr/lib64/python2.7/gzip.py", line 94, in __init__
    fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
IOError: [Errno 21] Is a directory: '/tmp/speech_dataset/' 

The command I'm running is:

python tensorflow/examples/speech_commands/train.py --data_url=path/to/data/ --sample_rate=20000 --wanted_words=one,two,three,four,five,six,seven,eight,nine

Now, the error says that '/tmp/speech_dataset/' is a directory, but it is expecting a file, I guess. When I looked at train.py file, found the following code:

parser.add_argument(
      '--data_dir',
      type=str,
      default='/tmp/speech_dataset/',
      help="""\
      Where to download the speech training data to.
      """)

The --data-dir argument defines where the files from the downloaded dataset should be stored. However, I'm not changing at at all, nor does the code need to save any data, since I already have the data on my computer, where I define them at --data-url argument. It seems to me that this is a bug from TensorFlow.

Does anyone has experience with speech recognition on TensorFlow and know where the problem might be?

Thank you in advance!

Upvotes: 1

Views: 1268

Answers (1)

UrmLmn
UrmLmn

Reputation: 1257

OK, I solved the problem, so I'm posting it here if anyone runs to the same problem.

There was some confusion with the TensorFlow documentation. I thought that the --data-url argument should get the path to my data set, but this argument should only be used whenever you want to download some data set from somewhere. In case where you have your own data set, you need to explicitly define it as blank, i.e. give the following to your command --data-url= and --data-dir should then be the path to your data set.

Upvotes: 1

Related Questions