lucky yang

Reputation: 1669

How to map a dataset of filenames to a dataset of file contents

For example, I have a TensorFlow dataset where each element is a tf.string tensor representing the filename of an image file. I want to map this filename dataset to a dataset of image content tensors.

I wrote the code below, but it doesn't work because the function passed to map is not executed eagerly. (It raises an error saying the Tensor object has no attribute 'numpy'.)

def parseline(line):
    filename = line.numpy()
    image = some_library.open_image(filename).to_numpy()
    return image

dataset = dataset.map(parseline)

Upvotes: 0

Views: 2643

Answers (1)

Sharky

Reputation: 4543

Basically, it can be done the following way:

import os
import tensorflow as tf

path = 'path_to_images'

# Build a plain Python list of file paths; from_tensor_slices below
# turns each path into a tf.string tensor, which the tf functions require.
files = [os.path.join(path, name) for name in os.listdir(path)]

def parse_image(filename):
    contents = tf.io.read_file(filename)  # works only with the filename as a tensor
    image = tf.image.decode_image(contents)  # decode the raw bytes into an image tensor
    return image

dataset = tf.data.Dataset.from_tensor_slices(files)
dataset = dataset.map(parse_image).batch(1)

If you're in eager mode, just iterate over the dataset:

for batch in dataset:
    print(batch)

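If not (TensorFlow 1.x graph mode), you'll need an iterator:

iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()  # create the op once, outside the session
with tf.Session() as sess:
    print(sess.run(next_element))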
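
If the file really has to be opened by a Python library, as in the question's parseline, one possible option is to wrap the Python code in tf.py_function, which runs it eagerly so that .numpy() is available. Below is a minimal sketch, assuming TensorFlow 2.x with eager execution. The .npy files and np.load are only stand-ins for real image files and some_library.open_image, and load_with_python is a hypothetical helper name.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Write two small arrays to disk as stand-ins for image files (assumption).
tmp_dir = tempfile.mkdtemp()
files = []
for idx in range(2):
    file_path = os.path.join(tmp_dir, "img_%d.npy" % idx)
    np.save(file_path, np.full((4, 4, 3), idx, dtype=np.uint8))
    files.append(file_path)

def load_with_python(filename):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() works.
    file_name = filename.numpy().decode("utf-8")
    # Stand-in for some_library.open_image(file_name).to_numpy()
    return np.load(file_name)

def parseline(filename):
    image = tf.py_function(load_with_python, [filename], tf.uint8)
    # py_function drops static shape information; restore it when it is known.
    image.set_shape([4, 4, 3])
    return image

dataset = tf.data.Dataset.from_tensor_slices(files).map(parseline)
images = [image.numpy() for image in dataset]
```

Note that code inside tf.py_function runs in the Python interpreter, so it is usually slower than native ops such as tf.io.read_file and tf.image.decode_image. Prefer those when the file format is supported.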

Upvotes: 1
