Reputation: 13
I use the following code to load a bunch of images in my data set in TensorFlow, which works well:
def load(image_file):
    image = tf.io.read_file(image_file)
    image = tf.image.decode_jpeg(image)
    image = tf.cast(image, tf.float32)
    return image

train_dataset = tf.data.Dataset.list_files(PATH+'train/*.jpg')
train_dataset = train_dataset.map(load, num_parallel_calls=tf.data.experimental.AUTOTUNE)
I am wondering how I can use similar code to load a bunch of CSV files. Each CSV file holds a 256 x 256 grid of values and can be treated as a grayscale image. I don't know what I should use instead of "tf.image.decode_jpeg" in the "load" function. I would really appreciate your help.
Upvotes: 1
Views: 1611
Reputation:
You can achieve this by changing the load function as shown below.
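import pandas as pd
import tensorflow as tf

def load(image_file):
    # py_function passes the path as an eager string tensor; decode it to str
    image_file = image_file.numpy().decode("utf-8")
    # header=None assumes the CSV has no header row; drop it if yours has one
    image = pd.read_csv(image_file, header=None).values
    return tf.convert_to_tensor(image, dtype=tf.float32)

train_dataset = tf.data.Dataset.list_files(PATH+"/*.csv")
print(train_dataset)
# [tf.float32] as Tout makes each element a 1-tuple holding the tensor
train_dataset = train_dataset.map(lambda x: tf.py_function(load, [x], [tf.float32]), num_parallel_calls=tf.data.experimental.AUTOTUNE)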
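Wrap the load function with tf.py_function in map. Functions passed to map normally run in graph mode, where .numpy() is not available; tf.py_function runs load eagerly, so the file name can be decoded to a Python string.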
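One side effect worth noting: a tensor returned by tf.py_function carries no static shape, which can trip up later steps (for example, layers that need known dimensions). Below is a minimal sketch of one way to restore it, assuming every file really holds a 256 x 256 grid of numbers with no header row. np.loadtxt stands in for pd.read_csv here, and the names load_eager, load_with_shape and make_dataset are illustrative only.

```python
import numpy as np
import tensorflow as tf

def load_eager(path):
    # Runs eagerly inside tf.py_function, so .numpy() is available here.
    filename = path.numpy().decode("utf-8")
    # Assumption: comma-separated numeric values, no header row.
    data = np.loadtxt(filename, delimiter=",", dtype=np.float32)
    return tf.convert_to_tensor(data)

def load_with_shape(path, height=256, width=256):
    # Passing a single dtype (not a list) as Tout returns a single tensor
    # instead of a 1-tuple.
    image = tf.py_function(load_eager, [path], tf.float32)
    # py_function output has unknown static shape; declare it explicitly.
    image.set_shape([height, width])
    return image

def make_dataset(pattern):
    # pattern is a glob such as "some_dir/*.csv" (hypothetical path).
    files = tf.data.Dataset.list_files(pattern)
    return files.map(load_with_shape,
                     num_parallel_calls=tf.data.experimental.AUTOTUNE)
```

With this variant, the dataset's element_spec should report a (256, 256) float32 tensor rather than an unknown shape, and each element is a plain tensor instead of a 1-tuple.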
Example output:
for i in train_dataset.take(1):
    print(i)
(<tf.Tensor: shape=(256, 256), dtype=float32, numpy=
array([[255., 255., 255., ..., 255., 255., 255.],
       [255., 255., 255., ..., 255., 255., 255.],
       [255., 255., 255., ..., 255., 255., 255.],
       ...,
       [255., 255., 255., ..., 255., 255., 255.],
       [255., 255., 255., ..., 255., 255., 255.],
       [255., 255., 255., ..., 255., 255., 255.]], dtype=float32)>,)
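If you would rather stay closer to the original image pipeline and avoid tf.py_function, the file can also be parsed with TensorFlow string ops alone. This is only a sketch under assumptions: the files contain plain comma-separated numbers, no header row, no spaces around values, and exactly height x width entries. The function name load_csv_tf is made up for illustration.

```python
import tensorflow as tf

def load_csv_tf(path, height=256, width=256):
    # Read the whole file as one scalar string and drop trailing whitespace.
    text = tf.strings.strip(tf.io.read_file(path))
    # Turn line breaks into commas so one split yields every value in order.
    flat = tf.strings.regex_replace(text, r"\r?\n", ",")
    cells = tf.strings.split(flat, ",")  # 1-D string tensor for scalar input
    values = tf.strings.to_number(cells, out_type=tf.float32)
    # Reshape to the assumed image size; fails at run time if the count differs.
    return tf.reshape(values, [height, width])
```

Because this runs entirely in graph mode, it can be passed straight to map, e.g. train_dataset.map(load_csv_tf, num_parallel_calls=tf.data.experimental.AUTOTUNE), and the resulting elements have a known (256, 256) shape.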
Upvotes: 2