Reputation: 11
I am practising neural networks by building my own in notebooks, and I am trying to check my model against an equivalent model in Keras. My model seems to behave the same as other simple hand-coded neural network implementations, such as this one: https://towardsdatascience.com/coding-neural-network-forward-propagation-and-backpropagtion-ccf8cf369f76
However, as I increase the number of epochs, the weights of the Keras model slowly diverge from my own. I am training the network using plain gradient descent, with the batch size equal to the whole training set, and with the Keras model's initial weights set to the same values as my model's initial weights. (I have been doing this on the Iris data set for now, hence batch_size=150.)
Is there something happening by default in Keras that means the model I'm describing below functions slightly differently from my model (or the one described in the article)? Something like batch normalisation?
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation, Input

# Layer sizes: 4 inputs -> 20 -> 10 -> 1 output
network_shape = np.array([4, 20, 10, 1])
activations = ["relu", "relu", "sigmoid"]

model = Sequential()
model.add(Input(shape=(network_shape[0],)))
for i in range(len(activations)):
    model.add(Dense(units=network_shape[i + 1], activation=activations[i]))

# set_weights, alpha, X, y and n_iter are defined earlier in the notebook;
# set_weights holds the same initial weights used by my own model.
model.set_weights(set_weights)

sgd = keras.optimizers.SGD(learning_rate=alpha, momentum=0.0)
model.compile(loss='binary_crossentropy', optimizer=sgd)
model.fit(X.T, y.T, batch_size=150, epochs=n_iter, verbose=0, shuffle=False)
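For reference, this is roughly how I've been comparing the two sets of weights after training. It's a minimal sketch: my_weights is a placeholder name for my own model's weights, assumed to be stored as a list of arrays in the same (kernel, bias, kernel, bias, ...) order that model.get_weights() returns.

# Compare the trained Keras weights with my own, layer by layer.
# my_weights is a placeholder for my model's weight arrays, in the
# same order as model.get_weights().
for keras_w, my_w in zip(model.get_weights(), my_weights):
    print(keras_w.shape, np.max(np.abs(keras_w - my_w)))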
Upvotes: 0
Views: 71
Reputation: 131
If you want to train an identical model to the one from the article, you'll need identical initial weights and hyperparameters. Even then, unless you're learning a very simple model like y = mx + b, small numerical differences between the two implementations accumulate with every update, so once your number of epochs exceeds the example model's, the weights won't be identical.
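As a toy illustration of that accumulation (made-up numbers, nothing to do with the asker's model): the same gradient-descent update carried out in float32 and float64 starts from identical values but slowly drifts apart over many steps.

import numpy as np

rng = np.random.default_rng(0)
# Identical starting weight, two precisions.
w32 = np.float32(0.5)
w64 = np.float64(0.5)
# The same sequence of small "gradients" for both.
grads = rng.standard_normal(10_000).astype(np.float32) * np.float32(1e-3)
for g in grads:
    w32 = w32 - np.float32(0.01) * g               # update in float32
    w64 = w64 - np.float64(0.01) * np.float64(g)   # same update in float64
# The two weights are no longer exactly equal after many steps.
print(w32, w64, abs(np.float64(w32) - w64))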
Upvotes: 0