momo

Reputation: 1122

ValueError: TensorFlow requires that the following symbols must be defined before the loop

I am trying to build an input pipeline with the tf.data API. My data is 3D: with plain NumPy operations I would have ended up with an array of dimensions [?,256x256x3x100], which can be thought of as 100 frames, each of size 256x256x3.

import glob
import os
import numpy as np
import tensorflow.compat.v1 as tf

def readfile(filenames):
    flag = 0
    for name in filenames:
        string = tf.read_file(name)
        image = tf.image.decode_image(string, channels=3)
        if flag == 0:
            bunch = image
            flag = 1
        else:
            bunch = tf.concat([bunch,image],1)   
    return bunch

with tf.device("/cpu:0"):
    train_files = []
    for s in [x[0] for x in os.walk("path/to/data/folders")]:
        if(s == "path/to/data/folders"):
            continue
        train_files.append(glob.glob(s+"/*.png"))
    # shape of train_files is [5,100]
    train_dataset = tf.data.Dataset.from_tensor_slices(train_files)
    train_dataset = train_dataset.map(readfile, num_parallel_calls=16)

I think the error occurs because 'bunch' changes size inside the for loop. The error:

ValueError                                Traceback (most recent call last)
<ipython-input-13-c2f88ca344dc> in <module>
      22     train_dataset = train_dataset.map(
 ---> 23             readfile, num_parallel_calls=16)


ValueError: in converted code:

ValueError: TensorFlow requires that the following symbols must be defined before the loop: ('bunch',)

How do I read the data correctly?

EDIT

What worked for me:

def readfile(filenames):
    flag = 0
    name = filenames[0]
    string = tf.read_file(name)
    image = tf.image.decode_image(string, channels=3)
    bunch = image
    for name in filenames:
        string = tf.read_file(name)
        image = tf.image.decode_image(string, channels=3)
        if flag == 0:
            bunch = image
            flag = 1
        else:
            bunch = tf.concat([bunch,image],1)   
    return bunch

I'm not sure why bunch has to be initialised before the loop, when the first iteration should already take care of that through bunch = image. Might it be because flag is not defined as a tensor, so bunch = image is never actually run?

Upvotes: 2

Views: 2137

Answers (2)

learner

Reputation: 3472

The variable bunch is first assigned inside the loop in readfile(), and that is what causes the error: a variable that the loop updates cannot be created inside the loop at run time, so it must already exist when the loop starts. A fix is to initialise bunch before the loop. A code sample follows:

import glob
import os
import numpy as np
import tensorflow.compat.v1 as tf

def readfile(filenames):
    flag = 0
    bunch = <some_appropriate_initialization>
    for name in filenames:
        string = tf.read_file(name)
        image = tf.image.decode_image(string, channels=3)
        if flag == 0:
            bunch = image
            flag = 1
        else:
            bunch = tf.concat([bunch,image],1)   
    return bunch

# Rest of the code
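
If choosing an initial value for bunch is awkward, one alternative is to avoid the loop variable altogether. The following is only a sketch under assumptions that go beyond the question: every frame in a folder has the same height and width, and tf.image.decode_png is swapped in for tf.image.decode_image (the files are *.png) so that each decoded frame has a known rank. The helper name decode_frame is made up for illustration.

```python
import tensorflow.compat.v1 as tf


def decode_frame(name):
    # Assumption: the files are PNGs, so decode_png is used instead of
    # decode_image; it always returns a rank-3 [height, width, 3] tensor.
    return tf.image.decode_png(tf.read_file(name), channels=3)


def readfile(filenames):
    # Decode every file without a Python loop: result is [N, H, W, 3].
    # Assumption: all frames share the same height and width.
    frames = tf.map_fn(decode_frame, filenames, dtype=tf.uint8)
    shape = tf.shape(frames)
    n, h, w = shape[0], shape[1], shape[2]
    # [N, H, W, 3] -> [H, N, W, 3] -> [H, N*W, 3], which matches
    # concatenating the frames one after another along axis 1.
    side_by_side = tf.transpose(frames, [1, 0, 2, 3])
    return tf.reshape(side_by_side, [h, n * w, 3])
```

The call site should not need to change: train_dataset.map(readfile, num_parallel_calls=16) as in the question. Because no variable is carried from one iteration to the next, there is nothing that has to be defined before a loop.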

Upvotes: 1

Mahendra Singh Meena

Reputation: 608

You can't use arbitrary Python code inside the function passed to dataset.map, which is readfile in your case. There are two ways to solve this:

  1. Keep the readfile code as it is and call it through tf.py_function instead. It then runs with eager execution, so you can write any Python logic as normal.

  2. Rewrite readfile so that it uses only TensorFlow functions for the transformation. Performance-wise this is much better than using tf.py_function.

You can find examples of both at https://www.tensorflow.org/api_docs/python/tf/py_function
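
As a rough illustration of the first option, the sketch below wraps the Python loop in tf.py_function. It is not taken from the linked page. The helper name _concat_frames is hypothetical, and tf.image.decode_png is assumed in place of tf.image.decode_image because the files are *.png.

```python
import tensorflow.compat.v1 as tf


def _concat_frames(filenames):
    # Runs eagerly inside tf.py_function, so an ordinary Python loop
    # over the filename tensor works here.
    frames = [
        tf.image.decode_png(tf.read_file(name), channels=3)
        for name in filenames
    ]
    # Place the frames side by side along axis 1, as in the question.
    return tf.concat(frames, axis=1)


def readfile(filenames):
    bunch = tf.py_function(_concat_frames, [filenames], tf.uint8)
    # py_function drops static shape information; restore what is known.
    bunch.set_shape([None, None, 3])
    return bunch
```

readfile can then be passed to train_dataset.map as before. Since the wrapped function runs in the Python interpreter, it is likely to parallelise less well than a version built only from TensorFlow ops.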

Upvotes: 1
