maxest

Reputation: 91

TensorFlow: approximating a function

I wrote a simple TensorFlow program which doesn't work. Here's the problem I'm trying to solve: given x as input, I would like to roughly evaluate a function that returns 0.0 if x is in the [0, 0.33] or [0.66, 1.0] interval, and 1.0 if x is in the (0.33, 0.66) interval.
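In plain Python, the function I want to approximate looks roughly like this (the name target is just for illustration):

def target(x):
    # 1.0 inside the open interval (0.33, 0.66), 0.0 elsewhere in [0, 1]
    return 1.0 if 0.33 < x < 0.66 else 0.0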

Here is the code:

import tensorflow as tf
import numpy
import scipy


# input and output
x = tf.placeholder(tf.float32, shape=[None, 1])
y_true = tf.placeholder(tf.float32, shape=[None, 1])


# vars
weights = tf.Variable(tf.zeros([1, 1]))
biases = tf.Variable(tf.zeros([1]))


logits = tf.matmul(x, weights) + biases
y_pred = tf.nn.softmax(logits)
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_true)
cost = tf.reduce_mean(cross_entropy)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(cost)


x_train = [ [0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9] ]
y_train = [ [0.0], [0.0], [0.0], [1.0], [1.0], [1.0], [0.0], [0.0], [0.0] ]


sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(100):
  sess.run(optimizer, {x: x_train, y_true: y_train})


we, bi = sess.run([weights, biases])
print("we: %s bi: %s"%(we, bi))

answer = sess.run(y_pred, feed_dict={x: x_train})
print(answer)

The values in weights and biases are just plain wrong after the training, and the predicted values are all 1s even after the first iteration and never change.

The code I wrote is based on some code used for digit recognition, and I thought I would "minimize" the problem down to a single number/"pixel".

Any ideas what to try, other than changing the iteration count or the learning rate?

EDIT: So I managed to solve my problem by using sigmoid, as suggested below, and by using more layers. Here is the working code:

import tensorflow as tf
import numpy


# consts
input_num_units = 1
hidden1_num_units = 8
hidden2_num_units = 16
output_num_units = 1


# input and output
x = tf.placeholder(tf.float32, shape=[None, 1])
y_true = tf.placeholder(tf.float32, shape=[None, 1])


# vars
weights = {
    'hidden1': tf.Variable(tf.random_normal([input_num_units, hidden1_num_units])),
    'hidden2': tf.Variable(tf.random_normal([hidden1_num_units, hidden2_num_units])),
    'output': tf.Variable(tf.random_normal([hidden2_num_units, output_num_units]))
}

biases = {
    'hidden1': tf.Variable(tf.random_normal([hidden1_num_units])),
    'hidden2': tf.Variable(tf.random_normal([hidden2_num_units])),
    'output': tf.Variable(tf.random_normal([output_num_units]))
}


hidden_layer_1 = tf.add(tf.matmul(x, weights['hidden1']), biases['hidden1'])
hidden_layer_1 = tf.nn.sigmoid(hidden_layer_1)

hidden_layer_2 = tf.add(tf.matmul(hidden_layer_1, weights['hidden2']), biases['hidden2'])
hidden_layer_2 = tf.nn.sigmoid(hidden_layer_2)

output_layer = tf.matmul(hidden_layer_2, weights['output']) + biases['output']
output_value = tf.nn.sigmoid(output_layer)

cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=output_layer, labels=y_true))
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)


x_train = [ [0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9] ]
y_train = [ [0.75], [0.0], [0.0], [1.0], [0.5], [1.0], [0.0], [0.0], [0.0] ]


sess = tf.Session()
sess.run(tf.global_variables_initializer())
for i in range(10000):
  sess.run(optimizer, {x: x_train, y_true: y_train})


answer = sess.run(output_value, feed_dict={x: x_train})
print(answer)

To see whether my model works well I plotted the network's output for a whole set of values in the [0, 1] interval, and it turned out they produced pretty much what I was expecting. This can be fiddled with; I noticed, for instance, that the more iterations I perform, the "steeper" the function becomes, and the smoother it is if only a few iterations are performed.
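For reference, the plot was produced roughly like this, reusing sess, x and output_value from the code above (matplotlib assumed):

import numpy
import matplotlib.pyplot as plt

# evaluate the trained network on a dense grid of inputs covering [0, 1]
x_plot = numpy.linspace(0.0, 1.0, 200).reshape(-1, 1)
y_plot = sess.run(output_value, feed_dict={x: x_plot})

plt.plot(x_plot, y_plot)
plt.xlabel("x")
plt.ylabel("network output")
plt.show()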

Upvotes: 0

Views: 330

Answers (1)

Dr. Snoopy

Reputation: 56377

The weights don't change because the output never changes; it is always 1.0. This happens because you apply softmax to a single output instead of to a vector. You should use a sigmoid activation for this case.
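You can verify this directly: softmax over a length-1 vector is 1.0 no matter what the logit is. A minimal check:

import tensorflow as tf

z = tf.placeholder(tf.float32, shape=[None, 1])
s = tf.nn.softmax(z)  # softmax over the last axis, which has size 1 here

with tf.Session() as sess:
    # every row comes out as [1.] regardless of the input logit
    print(sess.run(s, feed_dict={z: [[-5.0], [0.0], [5.0]]}))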

Just replace softmax_cross_entropy_with_logits with sigmoid_cross_entropy_with_logits. You should also initialize the weights with non-zero values, ideally small random values.
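Applied to your original single-layer code, the change would look roughly like this (a sketch; a single linear layer still cannot represent a bump-shaped target well, but at least the output is no longer stuck at 1.0):

# small random initialization instead of tf.zeros
weights = tf.Variable(tf.random_normal([1, 1], stddev=0.1))
biases = tf.Variable(tf.random_normal([1], stddev=0.1))

logits = tf.matmul(x, weights) + biases
y_pred = tf.nn.sigmoid(logits)  # sigmoid instead of softmax for a single output

cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=y_true)
cost = tf.reduce_mean(cross_entropy)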

Upvotes: 2
