Reputation: 11
I am practising neural networks by building my own in notebooks, and I am trying to check my model against an equivalent model in Keras. My model seems to behave the same as other simple hand-coded neural network implementations, such as this one: https://towardsdatascience.com/coding-neural-network-forward-propagation-and-backpropagtion-ccf8cf369f76
However, as I increase the number of epochs, the weights of the Keras model slowly diverge from my own. I am training the network using plain gradient descent, with the batch size equal to the whole training set, and with the Keras model's initial weights set to the same values as my model's initial weights. (I have been doing this on the Iris data set for now, hence batch_size=150.)
Is there something happening by default in Keras that means the model I'm describing below functions slightly differently from my model (or the one described in the article)? Something like batch normalisation?
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Input

# Layer sizes: 4 inputs -> 20 -> 10 -> 1 output
network_shape = np.array([4, 20, 10, 1])
activations = ["relu", "relu", "sigmoid"]

model = Sequential()
model.add(Input(shape=(network_shape[0],)))
for i in range(len(activations)):
    model.add(Dense(units=network_shape[i + 1], activation=activations[i]))

# set_weights, alpha, X, y and n_iter are defined earlier in the notebook;
# set_weights holds the same initial weights used by my own model.
model.set_weights(set_weights)

sgd = keras.optimizers.SGD(learning_rate=alpha, momentum=0.0)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X.T, y.T, batch_size=150, epochs=n_iter, verbose=0, shuffle=False)
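For reference, this is roughly how I've been comparing the two sets of weights after training. It's a minimal sketch: my_weights is a placeholder name for my own model's weights, assumed to be stored as a list of arrays in the same (kernel, bias, kernel, bias, ...) order that model.get_weights() returns.

# Compare the trained Keras weights with my own, layer by layer.
# my_weights is a placeholder for my model's weight arrays, in the
# same order as model.get_weights().
for keras_w, my_w in zip(model.get_weights(), my_weights):
    print(keras_w.shape, np.max(np.abs(keras_w - my_w)))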
Upvotes: 0
Views: 71
Reputation: 131
If you want to train an identical model to the one from the article, you'll need identical initial weights and hyperparameters. Even then, unless you're learning a very simple model like y = mx + b, small numerical differences between the two implementations accumulate with every update, so once your number of epochs exceeds the example model's, the weights won't be identical.
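As a toy illustration of that accumulation (made-up numbers, nothing to do with the asker's model): the same gradient-descent update carried out in float32 and float64 starts from identical values but slowly drifts apart over many steps.

import numpy as np

rng = np.random.default_rng(0)
# Identical starting weight, two precisions.
w32 = np.float32(0.5)
w64 = np.float64(0.5)
# The same sequence of small "gradients" for both.
grads = rng.standard_normal(10_000).astype(np.float32) * np.float32(1e-3)
for g in grads:
    w32 = w32 - np.float32(0.01) * g               # update in float32
    w64 = w64 - np.float64(0.01) * np.float64(g)   # same update in float64
# The two weights are no longer exactly equal after many steps.
print(w32, w64, abs(np.float64(w32) - w64))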
Upvotes: 0