costisst

Reputation: 391

How to load RGB images as tensors in TensorFlow?

I'm new to TensorFlow and I'm trying to create a Stacked Sparse Denoising Auto-encoder model. I found a way to load my training (and testing) sets through examples from here and GitHub, but I cannot use the result as a tensor to perform the required multiplications etc. (this code only loads the images):

import tensorflow as tf
import glob
import numpy as np
from PIL import Image as im

im_list = []

#LOAD ALL SETS
training_set = []
training_set = glob.glob("folder/training_set/*.jpg")

testing_set = []
testing_set = glob.glob("folder/corrupted/*.jpg") 

# testing my code only for the training set
filename_queue = tf.train.string_input_producer(training_set)

reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)

#data = tf.image.decode_jpeg(value)
data = tf.decode_raw(value, tf.uint8)

sess = tf.InteractiveSession()

sess.run(tf.global_variables_initializer())
sess.run(tf.local_variables_initializer())

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)

for i in range(196):
    print(i)
    key_val, data_val = sess.run([key, data])
    im_list.append(data_val)



coord.request_stop()
coord.join(threads)

Using this code I manage to save all my images as a list of uint8 arrays containing the data, but their sizes range from ~800 to ~1000. My images are 32x32x3 (3072 values each), so something is missing.
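For what it's worth, ~800 to ~1000 matches a compressed JPEG file size: tf.decode_raw only reinterprets the raw file bytes and never decodes them. A quick PIL-only sanity check (a synthetic gradient image stands in for the real files here) shows the difference between the compressed byte count and the decoded pixel array:

```python
import io
import numpy as np
from PIL import Image as im

# a smooth 32x32 RGB test image (horizontal gradient), saved as JPEG in memory
row = np.arange(32, dtype=np.uint8) * 8
image_3d = np.dstack([np.tile(row, (32, 1))] * 3)   # shape (32, 32, 3)

buf = io.BytesIO()
im.fromarray(image_3d).save(buf, format="JPEG")
jpeg_bytes = buf.getvalue()

# decoding the JPEG recovers the full 32x32x3 pixel array
decoded = np.asarray(im.open(io.BytesIO(jpeg_bytes)))

print(len(jpeg_bytes) < 32 * 32 * 3)  # True: compressed size is well under 3072
print(decoded.shape)                  # (32, 32, 3)
```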

The other way I tried is:

filename_queue = tf.train.string_input_producer(training_set)

image_reader = tf.WholeFileReader()

_, image_file = image_reader.read(filename_queue)

imagee = tf.image.decode_jpeg(image_file)

#tf.cast(imagee, tf.float32)

sess = tf.InteractiveSession()

sess.run(tf.global_variables_initializer())

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)

image = sess.run(imagee) 

imaginar = image.astype(np.float32)

#train_step.run(feed_dict={x: imaginar, y_: imaginar_test})

coord.request_stop()
coord.join(threads)

and I'm trying to calculate:

y = tf.matmul(x,W) + b           
h_x_s = tf.sigmoid(y)
h_x = tf.matmul(h_x_s,W_) + b_
y_xi = tf.sigmoid(h_x) 

This way my images are numpy arrays of shape 32x32x3, but I can't find a way to feed them as tensors so that tf.matmul works. I always get errors about mismatched array shapes.

# VARIABLES
x = tf.placeholder(tf.float32, [32, 32, 3])
y_ = tf.placeholder(tf.float32, [32, 32, 3])

W = tf.Variable(tf.zeros([32, 32, 3]))
b = tf.Variable(tf.zeros([32, 32, 3]))

W_ = tf.Variable(tf.zeros([32, 32, 3]))
b_ = tf.Variable(tf.zeros([32, 32, 3]))

(An unsuccessful attempt.)

How should I load (and decode) my images, and what sizes should my placeholders and Variables be? Any help would be much appreciated!

Thanks :)

Upvotes: 2

Views: 6010

Answers (1)

costisst

Reputation: 391

Just in case anyone has the same problem:

First of all, use decode_jpeg(data, channels=3) (channels=3 means RGB), or another decoder depending on your image type.

Then you can flatten the 3D image into a 2D vector. For example, if the image is (32, 32, 3), your vector should be (1, 32*32*3) -> (1, 3072). (A Python identifier cannot start with a digit, so I call it vec_2d here.) You can do that using

vec_2d = image_3d.reshape(1, -1)

and you can turn it back to 3D with

vec_2d.reshape(32, 32, 3)

Do not forget to normalize your data before using it as input. All you have to do is

vec_2d = vec_2d / vec_2d.max()
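Putting the reshape and the normalization together, a minimal numpy sketch (a random uint8 array stands in for a real decoded JPEG):

```python
import numpy as np

# stand-in for a decoded 32x32 RGB image (uint8)
image_3d = np.random.randint(0, 256, (32, 32, 3)).astype(np.uint8)

vec_2d = image_3d.reshape(1, -1)       # flatten: shape (1, 3072)
restored = vec_2d.reshape(32, 32, 3)   # round-trip back to 3D, lossless

# scale to [0, 1] before feeding the network
vec_norm = vec_2d / float(vec_2d.max())

print(vec_2d.shape)                        # (1, 3072)
print(np.array_equal(restored, image_3d))  # True
print(vec_norm.max())                      # 1.0
```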

I changed a lot in the code I posted before, so if you have any questions, ask me!
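As for the question's "what sizes should my Variables and placeholders be": once the input is flattened to (1, 3072), everything becomes an ordinary 2-D matrix product. A numpy sketch of one consistent set of shapes (the hidden size of 1024 is just an illustrative choice, and zeros stand in for real initial weights):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hidden = 32 * 32 * 3, 1024   # 3072 inputs; 1024 is an arbitrary hidden size

x  = np.zeros((1, n_in), dtype=np.float32)         # flattened, normalized image
W  = np.zeros((n_in, n_hidden), dtype=np.float32)  # encoder weights
b  = np.zeros(n_hidden, dtype=np.float32)
W_ = np.zeros((n_hidden, n_in), dtype=np.float32)  # decoder weights
b_ = np.zeros(n_in, dtype=np.float32)

# same chain as the question's y / h_x_s / h_x / y_xi, now with compatible shapes
y     = x @ W + b          # (1, 1024)
h_x_s = sigmoid(y)
h_x   = h_x_s @ W_ + b_    # (1, 3072): the reconstruction matches the input size
y_xi  = sigmoid(h_x)

print(y_xi.shape)  # (1, 3072)
```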

Upvotes: 5
