user1322801

Reputation: 859

TensorFlow speech recognition: sess.run fails with "could not convert string to float"

I have trained a model according to the instructions at: https://www.tensorflow.org/tutorials/sequences/audio_recognition#training

I got a ckpt file, but I wasn't able to freeze it and generate a PB file using TensorFlow's official instructions.

To create the PB file, I converted the pbtxt using the following converter: https://github.com/irfansharif/tensorflow/blob/master/converter.py

Now, when running TensorFlow's official "label_wav.py" script, I get the following error:

2018-08-02 10:15:12.263821: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "label_wav.py", line 134, in <module>
    tf.app.run(main=main, argv=[sys.argv[0]] + unparsed)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "label_wav.py", line 106, in main
    FLAGS.output_name, FLAGS.how_many_labels)
  File "label_wav.py", line 100, in label_wav
    run_graph(wav_data, labels_list, input_name, output_name, how_many_labels)
  File "label_wav.py", line 68, in run_graph
    predictions, = sess.run(softmax_tensor, {input_layer_name: wav_data})
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1104, in _run
    np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
  File "/home/user/.local/lib/python3.5/site-packages/numpy/core/numeric.py", line 531, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: could not convert string to float: b'RIFF$}\x00\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00

I wasn't sure which layers I am supposed to use as my input and output layers (I suspect this is the root of the problem).

Input Layer: data/Mul:0
Output Layer: ArgMax:0

(I selected ArgMax:0 as the output layer because it was set as such in the pbtxt file.)

The following layers exist in my graph: [image: my network structure from TensorBoard]

Any ideas what the correct layers are, or what I am doing wrong?

Upvotes: 0

Views: 276

Answers (1)

GPhilo

Reputation: 19143

In your traceback:

ValueError: could not convert string to float: b'RIFF$}\x00\x00WAVEfmt \x10\x00\x00\x00\x01\x00\x01\x00

You're trying to feed your network the raw contents of the file, read as a string of bytes; that will not work. The layer you chose as input (data/Mul:0) expects a float matrix of some shape, not a byte string, so the conversion to a float array fails before the graph even runs. You'll need to study the network architecture to understand how data is passed in and what input preprocessing is required before you can feed the data into the network.
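To make the mismatch concrete, here is a minimal sketch using only the Python standard library. It assumes a 16-bit mono PCM WAV file; the helper names `make_wav_bytes` and `wav_bytes_to_floats` are hypothetical and not part of label_wav.py or TensorFlow. It shows that raw WAV bytes cannot be interpreted as a float, while decoded samples can. It does not reproduce whatever further preprocessing your model expects, which you would still need to work out from the graph.

```python
import io
import struct
import wave


def make_wav_bytes(samples, sample_rate=16000):
    """Build an in-memory 16-bit mono WAV file (stand-in for reading a .wav from disk)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


def wav_bytes_to_floats(wav_bytes):
    """Decode 16-bit mono PCM WAV bytes into floats in [-1.0, 1.0) plus the sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        if w.getsampwidth() != 2 or w.getnchannels() != 1:
            raise ValueError("this sketch only handles 16-bit mono PCM")
        n_frames = w.getnframes()
        frames = w.readframes(n_frames)
        rate = w.getframerate()
    pcm = struct.unpack("<%dh" % n_frames, frames)
    return [s / 32768.0 for s in pcm], rate


wav_data = make_wav_bytes([0, 16384, -16384, 32767])

# The raw file contents start with the RIFF header seen in the traceback,
# and they cannot be converted to a float directly.
raw_bytes_rejected = False
try:
    float(wav_data)
except ValueError:
    raw_bytes_rejected = True

# Decoded samples, by contrast, are plain floats that a float input could accept.
floats, rate = wav_bytes_to_floats(wav_data)
```

Whether a flat list of samples like `floats` is the right thing to feed depends on the input tensor: a layer in the middle of the graph may expect already-transformed features in a specific shape, so check its shape and dtype first.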

Upvotes: 0
