Reputation: 159
Consider the functional equation

f(x+1) - f(x) = x^2, with f(x) = x for x in [0, 1].

Mathematically, we know how to solve this equation in closed form, but what if we look for an approximate solution f in the form of a fully connected neural network with one or two layers and ReLU activations?

What is the best way in TensorFlow to run gradient descent on the squared residual

(f(x+1) - f(x) - x^2)^2

for mini-batches of x values drawn randomly from [-10, 10]?

My question arises because the equation involves both f(x) and f(x+1), which differs from classical supervised learning.
Upvotes: 1
Views: 424
Reputation: 18221
One approach would be to simply run x+1 through the same network as well. That is, for a two-layer network, you could have a model that looks as follows:
import numpy as np
import matplotlib.pyplot as plt
# TensorFlow 1.x graph API. On TensorFlow 2.x, use instead:
#   import tensorflow.compat.v1 as tf
#   tf.disable_v2_behavior()
import tensorflow as tf

num_units_layer_1 = 200
num_units_layer_2 = 200
seed = 42

# A batch of scalar inputs, shape (batch_size, 1)
x = tf.placeholder(tf.float32, [None, 1])

weights = {
    'hidden1': tf.Variable(tf.random_normal([1, num_units_layer_1], seed=seed)),
    'hidden2': tf.Variable(tf.random_normal([num_units_layer_1, num_units_layer_2], seed=seed)),
    'output': tf.Variable(tf.random_normal([num_units_layer_2, 1], seed=seed))
}
biases = {
    'hidden1': tf.Variable(tf.random_normal([num_units_layer_1], seed=seed)),
    'hidden2': tf.Variable(tf.random_normal([num_units_layer_2], seed=seed)),
    'output': tf.Variable(tf.random_normal([1], seed=seed))
}

def model_f(x):
    # Every call reuses the same weights and biases
    hidden_layer_1 = tf.add(tf.matmul(x, weights['hidden1']), biases['hidden1'])
    hidden_layer_1 = tf.nn.relu(hidden_layer_1)
    hidden_layer_2 = tf.add(tf.matmul(hidden_layer_1, weights['hidden2']), biases['hidden2'])
    hidden_layer_2 = tf.nn.relu(hidden_layer_2)
    return tf.matmul(hidden_layer_2, weights['output']) + biases['output']

# Evaluate the same network at x and at x + 1
output_layer = model_f(x)
output_layerp = model_f(x + 1)

# Boundary condition f(x) = x on [0, 1]: outside that interval target_x equals x,
# so those samples contribute zero to the second term of the cost
in_range = tf.logical_and(x >= 0, x <= 1)
target_x = tf.where(in_range, output_layer, x)

cost = tf.reduce_mean((output_layerp - output_layer - x**2)**2) + tf.reduce_mean((target_x - x)**2)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
# Replaces the deprecated tf.initialize_all_variables()
init = tf.global_variables_initializer()
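To make the structure of that cost easier to see, here is a minimal framework-free sketch of the same two terms in plain Python. It is not part of the original answer: the helper names `make_params`, `net` and `loss` are hypothetical, and the network is shrunk to a single small hidden layer purely for illustration. The point it tries to show is that one set of parameters is evaluated at both x and x+1, and that the boundary term only picks up samples in [0, 1] while still being averaged over the whole batch, as in the TensorFlow cost.

```python
import random

def relu(z):
    return max(0.0, z)

def make_params(num_hidden, rng):
    # Hypothetical helper: random parameters for a 1-hidden-layer ReLU net
    return {
        "w1": [rng.gauss(0, 1) for _ in range(num_hidden)],
        "b1": [rng.gauss(0, 1) for _ in range(num_hidden)],
        "w2": [rng.gauss(0, 1) for _ in range(num_hidden)],
        "b2": rng.gauss(0, 1),
    }

def net(params, x):
    # Scalar input -> scalar output, one hidden ReLU layer
    hidden = [relu(w * x + b) for w, b in zip(params["w1"], params["b1"])]
    return sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]

def loss(params, batch):
    n = len(batch)
    # Equation term: the SAME params are used at x and at x + 1
    equation = sum((net(params, x + 1) - net(params, x) - x ** 2) ** 2 for x in batch) / n
    # Boundary term: only samples in [0, 1] contribute, averaged over the full batch
    boundary = sum((net(params, x) - x) ** 2 for x in batch if 0 <= x <= 1) / n
    return equation + boundary

rng = random.Random(42)
params = make_params(8, rng)
batch = [rng.uniform(-10, 10) for _ in range(400)]
value = loss(params, batch)
```

Because both evaluations share `params`, any gradient of `loss` with respect to the parameters collects contributions from the network at x and at x+1, which is what TensorFlow's automatic differentiation does for the graph above.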
Then, when estimating the parameters you can simply generate batches as you go:
with tf.Session() as sess:
    sess.run(init)
    # Estimate
    for epoch in range(5000):
        sample = np.random.uniform(-10, 10, (400, 1))
        _, c = sess.run([optimizer, cost], feed_dict={x: sample})
        if epoch % 1000 == 999:
            print(f'Epoch {epoch}, cost: {c}')
    # Make predictions and plot result
    xs = np.linspace(-10, 10, 500).reshape(500, 1)
    predictions = sess.run(output_layer, feed_dict={x: xs})
    plt.plot(xs, predictions)
This generates the following output:
We can compare this with what you get by simply using the functional equation to define f recursively:
def f(x):
    if x >= 0 and x <= 1:
        return x
    if x > 1:
        return f(x-1) + (x-1)**2
    if x < 0:
        return f(x+1) - x**2

plt.plot(xs, [f(x[0]) for x in xs])
plt.plot(xs, predictions)
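Pretty much spot on. It doesn't generalize beyond the training range [-10, 10], though: a ReLU network is piecewise linear, so outside the sampled interval it extrapolates roughly linearly, while f keeps growing cubically.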
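The recursive reference f can also be checked numerically against the equation itself, without plotting. The sketch below is an addition, not part of the original answer; `residual` is a hypothetical helper name. It evaluates f(x+1) - f(x) - x^2 at random points and should come out at floating-point noise level. One caveat it makes visible: imposing f(x) = x on the closed interval [0, 1] conflicts with the equation at x = 0, since the boundary condition gives f(1) = 1 while the equation would require f(1) = f(0) + 0 = 0. The reference therefore has jumps at the integers, which a continuous ReLU network can only approximate.

```python
import random

def f(x):
    # Reference solution built from the functional equation, with f(x) = x on [0, 1]
    if 0 <= x <= 1:
        return x
    if x > 1:
        return f(x - 1) + (x - 1) ** 2
    return f(x + 1) - x ** 2

def residual(g, x):
    # Zero wherever g satisfies g(x+1) - g(x) = x^2
    return g(x + 1) - g(x) - x ** 2

random.seed(0)
points = [random.uniform(-10, 10) for _ in range(1000)]
max_abs_residual = max(abs(residual(f, x)) for x in points)
print("largest residual at random points:", max_abs_residual)

# The one point where boundary condition and equation disagree
print("residual at x = 0:", residual(f, 0.0))
```

The same `residual` function can be applied to the network's predictions (wrapped as a Python callable) to measure how well the trained model satisfies the equation on points it was not trained on.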
Upvotes: 2