Aristodog

Reputation: 159

Solving functional equations with TensorFlow

Let us consider the functional equation

f(x + 1) - f(x) = x²,  with f(x) = x for x in [0, 1]

Mathematically, we know how to solve this equation in closed form, but what if we look for an approximate solution f in the form of a fully connected one- or two-layer neural network with ReLU activations?

What is the best way in TensorFlow to do gradient descent on

(f(x + 1) - f(x) - x²)²

for mini-batches of x's drawn randomly from [-10, 10]? My question arises from the fact that the equation involves both f(x) and f(x + 1), which differs from classical supervised learning.
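
Concretely, the per-batch objective I have in mind would look roughly like this (just a sketch, with f standing in for whatever network I end up using):

import numpy as np

def batch_loss(f, batch):
    # Mean squared residual of f(x + 1) - f(x) = x**2 over a mini-batch
    residual = f(batch + 1.0) - f(batch) - batch**2
    return np.mean(residual**2)

# e.g. with a dummy f, just to show the intended call
batch = np.random.uniform(-10, 10, (400, 1))
print(batch_loss(lambda x: x, batch))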

Upvotes: 1

Views: 424

Answers (1)

fuglede

Reputation: 18221

One approach would be to simply run through the network with x+1 as well. That is, for a two-layer network, you could have a model that looks as follows:

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

num_units_layer_1 = 200
num_units_layer_2 = 200

# Placeholder for a mini-batch of x values, shape (batch_size, 1)
x = tf.placeholder(tf.float32, [None, 1])

seed = 42

weights = {
    'hidden1': tf.Variable(tf.random_normal([1, num_units_layer_1], seed=seed)),
    'hidden2': tf.Variable(tf.random_normal([num_units_layer_1, num_units_layer_2], seed=seed)),
    'output': tf.Variable(tf.random_normal([num_units_layer_2, 1], seed=seed))
}

biases = {
    'hidden1': tf.Variable(tf.random_normal([num_units_layer_1], seed=seed)),
    'hidden2': tf.Variable(tf.random_normal([num_units_layer_2], seed=seed)),
    'output': tf.Variable(tf.random_normal([1], seed=seed))
}

def model_f(x):
    """Two-hidden-layer ReLU network mapping a (batch, 1) input to a (batch, 1) output."""
    hidden_layer_1 = tf.add(tf.matmul(x, weights['hidden1']), biases['hidden1'])
    hidden_layer_1 = tf.nn.relu(hidden_layer_1)
    hidden_layer_2 = tf.add(tf.matmul(hidden_layer_1, weights['hidden2']), biases['hidden2'])
    hidden_layer_2 = tf.nn.relu(hidden_layer_2)
    return tf.matmul(hidden_layer_2, weights['output']) + biases['output']

# Run the same network (shared weights) on both x and x + 1
output_layer = model_f(x)
output_layerp = model_f(x + 1)

# Pin f(x) = x on [0, 1]; outside that interval target_x = x, so the second cost term vanishes
in_range = tf.logical_and(x >= 0, x <= 1)
target_x = tf.where(in_range, output_layer, x)

cost = tf.reduce_mean((output_layerp - output_layer - x**2)**2) + tf.reduce_mean((target_x - x)**2)

optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)

init = tf.global_variables_initializer()

Then, when estimating the parameters, you can simply generate batches as you go:

with tf.Session() as sess:
    sess.run(init)

    # Estimate
    for epoch in range(5000):
        sample = np.random.uniform(-10, 10, (400, 1))
        _, c = sess.run([optimizer, cost], feed_dict = {x: sample})
        if epoch % 1000 == 999:
            print(f'Epoch {epoch}, cost: {c}')

    # Make predictions and plot result
    xs = np.linspace(-10, 10, 500).reshape(500, 1)
    predictions = sess.run(output_layer, feed_dict={x: xs})
    plt.plot(xs, predictions)

This generates the following output:

[Plot of the network's predictions for f over [-10, 10]]

We can compare this with what you get by simply using the functional equation to define f recursively:

# Define f directly from the functional equation, together with f(x) = x on [0, 1]
def f(x):
    if x >= 0 and x <= 1:
        return x
    if x > 1:
        return f(x-1) + (x-1)**2
    if x < 0:
        return f(x+1) - x**2
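
# Spot-check that the recursion satisfies f(x + 1) - f(x) = x**2 at a few sample points
for x0 in (-7.3, -2.5, 0.4, 3.8, 8.1):
    assert abs(f(x0 + 1) - f(x0) - x0**2) < 1e-6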

plt.plot(xs, [f(x[0]) for x in xs])
plt.plot(xs, predictions)

[Plot comparing the recursive solution with the network's predictions over [-10, 10]]

Pretty much spot on. It doesn't generalize to other ranges though:

[Plot of the same comparison outside the training range, where the fit degrades]
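
For reference, the same weight-sharing idea carries over to TensorFlow 2.x without placeholders or sessions. Below is a rough, untested sketch of that adaptation (same layer sizes, learning rate, and batch sampling as above), using Keras layers and a GradientTape training step:

import numpy as np
import tensorflow as tf

# Same architecture: two hidden ReLU layers of 200 units each
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dense(200, activation="relu"),
    tf.keras.layers.Dense(1),
])
opt = tf.keras.optimizers.Adam(learning_rate=0.01)

@tf.function
def train_step(x):
    with tf.GradientTape() as tape:
        fx = model(x)           # f(x)
        fxp = model(x + 1.0)    # f(x + 1), same weights
        residual = tf.reduce_mean((fxp - fx - x**2) ** 2)
        # Pin f(x) = x on [0, 1]; the term is zero elsewhere
        in_range = tf.logical_and(x >= 0.0, x <= 1.0)
        boundary = tf.reduce_mean(tf.where(in_range, fx - x, tf.zeros_like(x)) ** 2)
        loss = residual + boundary
    grads = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    return loss

for epoch in range(5000):
    batch = np.random.uniform(-10, 10, (400, 1)).astype(np.float32)
    loss = train_step(batch)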

Upvotes: 2
