Tom

Reputation: 1261

Mixing Queue Ops with regular Python for data loading

I am working on sequence learning for audio and need to load my audio data. Currently I use tf.decode_csv to load pairs of filenames and labels into a TensorFlow queue. I would then like to read each file and process it with other Python libraries. However, the CSV decoder and dequeue operations always return tensors, which cannot be passed directly to those libraries. How can I mix TF's streaming operations with external libraries?

file_path = tf.train.string_input_producer([csv_path])
reader = tf.TextLineReader()
_, csv_content = reader.read(file_path)

decode_op = tf.decode_csv(csv_content, record_defaults=[[""], [0]])
enqueue_ops.append(examples_queue.enqueue(decode_op))

tf.train.queue_runner.add_queue_runner(
    tf.train.queue_runner.QueueRunner(examples_queue, enqueue_ops))
...
# Problem: dequeue() returns tensors, not Python values.
sound_path, label_index = examples_queue.dequeue()
data = read_wav(sound_path)  # read_wav needs a Python string, not a tensor

Here sound_path is a tensor of dtype string, and I am unable to convert it into a Python string that the read_wav lib can accept.

Any ideas?

Upvotes: 0

Views: 339

Answers (2)

Olivier Moindrot

Reputation: 28198

Let's assume your read_wav function expects a filename (string) as argument, and returns a numpy array of float values (the decoded file).

The computation of read_wav takes place outside of the TensorFlow graph, in Python. Instead of calling sess.run(sound_path), which would require a separate run for each filename, you can use tf.py_func to wrap the Python code in a TensorFlow op. You need to specify the output types when calling tf.py_func.

sound_path, label_index = examples_queue.dequeue()
data = tf.py_func(your_function, [sound_path], [tf.float32])

Your function needs to take a numpy array as input and return numpy arrays.

def your_function(sound_path):
    sound_path = sound_path[0]
    data = read_wav(sound_path)  # should be a float32 numpy array, matching [tf.float32]
    return data
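
For illustration, here is a minimal sketch of what the wrapped Python function might look like if you did not already have read_wav. The name load_wav_as_float32 is hypothetical, and the sketch assumes 16-bit PCM WAV files read with the standard library's wave module. It also assumes that the filename may reach Python as bytes (or wrapped in a numpy array) rather than as a str, so it normalizes the input first.

```python
import wave

import numpy as np


def load_wav_as_float32(sound_path):
    """Hypothetical stand-in for read_wav: 16-bit PCM WAV -> float32 array."""
    # Unwrap a 0-d or one-element numpy array into a plain Python object.
    if isinstance(sound_path, np.ndarray):
        sound_path = sound_path.item()
    # Assumption: string tensors may arrive as bytes, so decode them to str.
    if isinstance(sound_path, bytes):
        sound_path = sound_path.decode("utf-8")

    with wave.open(sound_path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav_file.readframes(wav_file.getnframes())

    # Interpret the raw bytes as little-endian int16 samples (channels stay
    # interleaved) and scale them into the range [-1.0, 1.0).
    samples = np.frombuffer(frames, dtype="<i2")
    return (samples / 32768.0).astype(np.float32)
```

A function like this could be passed to tf.py_func in place of your_function, with [tf.float32] as the declared output type, since the returned array is explicitly cast to float32.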

Upvotes: 3

nessuno

Reputation: 27042

To extract the content of a tensor, you have to run it in a session.

sound_path_value = sess.run(sound_path)

Upvotes: 0
