Reputation: 11
I want to approximate cos(x) from 0 to pi/4 using a quadratic polynomial.
I believe I could train a perceptron on (say) 1000 points in my range, with training inputs (x^2, x, 1) and training outputs sigmoid(cos(x)). The resulting weights of the neuron would then be the coefficients of the polynomial (namely w1*x^2 + w2*x + w3).
Here is my attempt (but it is not converging)
import numpy
import matplotlib.pyplot as plt
import random
import math
#Generate random points in range
R = [random.uniform(0, math.pi/4) for i in range(1000)]
def sigmoid(x):
    return 1.0/(1.0 + numpy.exp(-x))

def sigmoid_prime(x):
    return sigmoid(x)*(1 - sigmoid(x))
#Generate training inputs
training_inputs = numpy.ones((len(R), 3))
for index, x in enumerate(R):
    training_inputs[index, 0] = x**2
    training_inputs[index, 1] = x
#Generate training outputs
training_outputs = numpy.array([sigmoid(math.cos(i)) for i in R]).reshape(len(R), 1)
#Arbitrary weights
weights = numpy.ones((3, 1))
#Neural Network
def train_nn(training_inputs, training_outputs, initial_weights, niter, errors_data):
    w = initial_weights
    for ii in range(niter):
        #forward propagation
        outputs = sigmoid(numpy.dot(training_inputs, w))
        #backward propagation
        errors = training_outputs - outputs
        deltaw = errors*sigmoid_prime(outputs)
        deltaw = numpy.dot(training_inputs.T, deltaw)
        w += deltaw
        #save errors
        errors_data[ii] = errors.reshape(len(R),)
    return outputs, w
NITER = 5000
errors = numpy.zeros((NITER, len(R)))
outputs, weights = train_nn(training_inputs, training_outputs, weights, NITER, errors)
#coefficients
print(weights)
Upvotes: 1
Views: 77
Reputation: 1419
It seems you have not set a learning rate, which means that in your case it is effectively 1. That is too high. When the learning rate is set too high, the optimization algorithm (gradient descent in your case) can diverge, overshoot the minimum, or oscillate around it.
So scale the weight update by some factor < 1, like this: w += 0.01*deltaw
# how strongly to update the weights
learning_rate = 0.01
#Neural Network
def train_nn(training_inputs, training_outputs, initial_weights, niter, errors_data):
    w = initial_weights
    for ii in range(niter):
        #forward propagation
        outputs = sigmoid(numpy.dot(training_inputs, w))
        #backward propagation
        errors = training_outputs - outputs
        deltaw = errors*sigmoid_prime(outputs)
        deltaw = numpy.dot(training_inputs.T, deltaw)
        w += learning_rate*deltaw  # updated line
        #save errors
        errors_data[ii] = errors.reshape(len(R),)
    return outputs, w
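With that change, a quick way to check convergence (a sketch that reuses sigmoid and the trained weights from your script; matplotlib is already imported there) is to compare the network output against the target sigmoid(cos(x)) on a dense grid:

import math
import numpy
import matplotlib.pyplot as plt

# Dense grid over the training range [0, pi/4]
xs = numpy.linspace(0, math.pi/4, 200)
grid_inputs = numpy.column_stack([xs**2, xs, numpy.ones_like(xs)])

# Network prediction vs. the training target sigmoid(cos(x))
predicted = sigmoid(numpy.dot(grid_inputs, weights)).ravel()
target = sigmoid(numpy.cos(xs))

plt.plot(xs, target, label="sigmoid(cos(x))")
plt.plot(xs, predicted, "--", label="network output")
plt.legend()
plt.show()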
Recommendations:
Mathematically you should move in the negative direction of your gradient, not the positive one; your update only works because the sign is already absorbed into the error term, so technically it is correct. It would be clearer to keep the correct mathematical formulation on each line, like this:
deltaw = -numpy.dot(training_inputs.T, deltaw)
w -= deltaw
If you then want to extend this to the full range of cos(x), i.e. to approximate it from 0 to pi, the sigmoid activation function will not be enough, since it ranges from 0 to 1 while cos(x) ranges from -1 to 1, so you may try the hyperbolic tangent function instead. A rough sketch of that change is shown below.
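Only the activation, its derivative, and the targets are swapped; the rest of the training loop stays the same (the names tanh_act and tanh_prime are just placeholders):

import numpy

def tanh_act(x):
    # hyperbolic tangent activation, ranges from -1 to 1
    return numpy.tanh(x)

def tanh_prime(x):
    # derivative of tanh(x)
    return 1.0 - numpy.tanh(x)**2

# In the question's script the targets would become
#   training_outputs = numpy.array([tanh_act(math.cos(i)) for i in R]).reshape(len(R), 1)
# the forward pass
#   outputs = tanh_act(numpy.dot(training_inputs, w))
# and tanh_prime replaces sigmoid_prime in the backward step.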
The resulting weights will not be the same as the coefficients of the 2nd-degree Taylor polynomial. You cannot use them on their own without the sigmoid function (if that was your initial intent). Instead, you can simply use linear regression with your polynomial features.
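For instance, a minimal sketch with numpy.polyfit, which performs exactly that least-squares fit of a polynomial to cos(x):

import math
import numpy

# Fit a degree-2 polynomial to cos(x) directly by least squares
xs = numpy.linspace(0, math.pi/4, 1000)
coeffs = numpy.polyfit(xs, numpy.cos(xs), 2)  # highest degree first: w1, w2, w3

print(coeffs)
print("max abs error:", numpy.max(numpy.abs(numpy.polyval(coeffs, xs) - numpy.cos(xs))))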
Upvotes: 0