jordan

Reputation: 751

How to use a black/white image as the input to TensorFlow

I am implementing reinforcement learning with TensorFlow, and the inputs are black/white images. Each pixel can be represented as a single bit (1 or 0).

Can I give the data directly to TensorFlow, with each bit as a feature? Or do I have to expand the bits to bytes before sending them to TensorFlow? I'm new to TensorFlow, so a code example would be nice.

Thanks

Upvotes: 0

Views: 1243

Answers (1)

anand_v.singh

Reputation: 2838

You can load the image data directly, as you normally would. The only effect of the image being binary is that the input has a single channel.

Whenever you put an image through a convnet, each output filter generally learns features across all input channels: there is a separate 2D kernel for each input channel / output channel combination. A binary image has only one input channel, so in the first layer each filter has exactly one 2D kernel.

Each layer is defined by its number of filters, and each filter holds one 2D kernel per input channel, whose responses are summed over the input channels. The layer therefore has input_channels * number_of_filters * kernel_height * kernel_width weights (plus one bias per filter, if biases are used); for the first layer here, input_channels is 1.
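As a quick check of the weight count described above, here is a minimal sketch in plain Python. The helper name conv2d_param_count and the 3x3 kernel size are illustrative assumptions; neither comes from TensorFlow or from the original answer.

```python
def conv2d_param_count(in_channels, num_filters, kernel_h, kernel_w, use_bias=True):
    """Count trainable parameters of one 2D convolution layer (hypothetical helper)."""
    # One 2D kernel per (input channel, filter) pair.
    weights = in_channels * num_filters * kernel_h * kernel_w
    # One bias per filter, if biases are used.
    biases = num_filters if use_bias else 0
    return weights + biases


# Binary (single-channel) input versus a 3-channel RGB input,
# both with 6 filters of size 3x3 (sizes assumed for illustration).
binary_params = conv2d_param_count(1, 6, 3, 3)
rgb_params = conv2d_param_count(3, 6, 3, 3)
print(binary_params, rgb_params)
```

With a single input channel the first layer is simply smaller; nothing else about the network changes.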

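Since you asked for some sample code: let your image be in a float tensor X of shape [batch, image_height, image_width, 1], and simply use

# tf.layers.conv2d (TensorFlow 1.x) accepts filters / kernel_size; tf.nn.conv2d expects an explicit weight tensor instead.
# height and width here are the kernel dimensions.
X_out = tf.layers.conv2d(X, filters=6, kernel_size=[height, width])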

After that you can apply an activation. The output will have 6 channels, one per filter. If you run into any problem or have doubts, feel free to comment. For theoretical clarification, check out https://www.coursera.org/learn/convolutional-neural-networks/lecture/nsiuW/one-layer-of-a-convolutional-network

Edit

Since the question was about a simple neural net, not a conv net, here is the code for that.

X_train_orig is the array in which the images are stored, one byte per pixel, each image being n_x by n_x pixels (the first axis indexes the images); n_x is used later.

You will need to flatten the input.

X_train_flatten = X_train_orig.reshape(X_train_orig.shape[0], -1).T

This first flattens each image into a row and then transposes the result, so that each image becomes a column.
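To address the bits-versus-bytes part of the question, the following NumPy sketch shows one way to prepare the data, assuming the images arrive bit-packed (8 pixels per byte). The sizes and the names packed, unpacked and m are made up for illustration. If your images already hold one value per pixel, only the reshape and the cast are needed.

```python
import numpy as np

m, n_x = 4, 8  # assumed for illustration: 4 images of 8x8 pixels

# Stand-in data: random 0/1 pixels, one byte per pixel.
rng = np.random.default_rng(0)
X_train_orig = rng.integers(0, 2, size=(m, n_x, n_x), dtype=np.uint8)

# Bit-packed storage: 8 pixels per byte along the last axis.
packed = np.packbits(X_train_orig, axis=-1)            # shape (m, n_x, n_x // 8)

# Expand the bits back to one value per pixel.
unpacked = np.unpackbits(packed, axis=-1)[..., :n_x]   # shape (m, n_x, n_x)

# Flatten each image to a row, transpose so each image is a column,
# and cast to float32 so the values can be used in a matrix product.
X_train_flatten = unpacked.reshape(m, -1).T.astype(np.float32)
print(X_train_flatten.shape)
```

The expansion costs memory (one float per pixel instead of one bit), but it gives the network one numeric feature per pixel, which is what a dense layer expects.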

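Then create the placeholder tensor X as:

# The input tensor must have the same dimension as your input layer.
# Use a float dtype: tf.matmul does not accept tf.bool, so feed the 0/1 pixels as floats.
X = tf.placeholder(tf.float32, [n_x*n_x, None])

Let W1 and b1 be the weight matrix and bias vector of the first layer, respectively.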

Z1 = tf.add(tf.matmul(W1,X),b1)  #Linear Transformation step
A1 = tf.nn.relu(Z1)              #Activation Step
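To make the shapes concrete, here is a NumPy sketch of what these two graph operations compute. It is not TensorFlow code; the hidden size n_h, the batch size m and the random initialisation are assumptions chosen only for illustration.

```python
import numpy as np

n_x, n_h, m = 8, 5, 4  # assumed: 8x8 images, 5 hidden units, 4 examples
rng = np.random.default_rng(0)

# Flattened binary images as float columns: shape (n_x*n_x, m).
X = rng.integers(0, 2, size=(n_x * n_x, m)).astype(np.float32)

# First-layer parameters: W1 is (n_h, n_x*n_x), b1 is (n_h, 1).
W1 = (rng.standard_normal((n_h, n_x * n_x)) * 0.01).astype(np.float32)
b1 = np.zeros((n_h, 1), dtype=np.float32)

Z1 = W1 @ X + b1          # linear transformation; b1 broadcasts over the m columns
A1 = np.maximum(Z1, 0.0)  # ReLU activation
print(A1.shape)
```

Each column of A1 holds the hidden activations for one image, which is why the placeholder leaves its second dimension unspecified.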

You then keep building the rest of your graph in the same way. I think that answers your question; if not, let me know.

Upvotes: 0
