Shouldn't the same neural network weights produce the same results?

Question

I am working with different deep learning frameworks as part of my research and have observed something I cannot explain.

I trained a fairly simple MLP model (on the MNIST dataset) in TensorFlow, extracted the trained weights, created the same model architecture in PyTorch, and applied the trained weights to the PyTorch model. I expected to get the same test accuracy from both the TensorFlow and PyTorch models, but this isn't the case: I get different results.

So my question is: if a model is trained to some optimal value, shouldn't the trained weights produce the same results every time testing is done on the same dataset, regardless of the framework used?

PyTorch Model:

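import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import Tensor


class Net(nn.Module):
    """Simple MLP: 784 -> 24 -> 10."""

    def __init__(self) -> None:
        super().__init__()
        self.fc1 = nn.Linear(784, 24)
        self.fc2 = nn.Linear(24, 10)

    def forward(self, x: Tensor) -> Tensor:
        x = torch.flatten(x, 1)  # (N, 28, 28) -> (N, 784)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)  # raw logits, no softmax
        return x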

TensorFlow Model:

import tensorflow as tf
from tensorflow.keras import layers, models


def build_model() -> tf.keras.Model:
    # Build model layers
    model = models.Sequential()
    # Flatten layer: (28, 28) -> (784,)
    model.add(layers.Flatten(input_shape=(28, 28)))
    # Fully connected layers; the last one outputs raw logits
    model.add(layers.Dense(24, activation='relu'))
    model.add(layers.Dense(10))
    # Compile the model
    model.compile(
        optimizer='sgd',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )
    # Return the newly built model
    return model

To extract the weights from the TensorFlow model and apply them to the PyTorch model, I use the following functions:

Extract Weights:

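import numpy as np


def get_weights(model):
    # Fetch the latest weights as a list: [kernel, bias, kernel, bias, ...]
    weights = model.get_weights()
    # Transpose each array: Keras Dense kernels are (in, out), while
    # PyTorch nn.Linear weights are (out, in). 1-D biases are unchanged.
    t_weights = []
    for w in weights:
        t_weights.append(np.transpose(w))
    return t_weights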

Apply Weights:

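from collections import OrderedDict

import torch


def set_weights(model, weights):
    """Set model weights from a list of NumPy ndarrays."""
    # Pair each state_dict key with the array at the same position in the list
    state_dict = OrderedDict(
        (k, torch.tensor(v)) for k, v in zip(model.state_dict().keys(), weights)
    )
    # Load into the model that was passed in (the original used an undefined `self`)
    model.load_state_dict(state_dict, strict=True)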

user11530462 · Accepted Answer

Providing the solution in the answer section for the benefit of the community. From the comments:

If you use the same weights in the same way, the results should be the same, up to floating-point rounding error. It also doesn't matter whether the model is trained at all: you can think of the model architecture as a chain of matrix multiplications with element-wise nonlinearities in between. How big is the difference? Are you comparing model outputs, or metrics computed over the dataset? As a suggestion, initialize the model with some random values in Keras, do a forward pass for a single batch, and compare the result with the PyTorch output for the same batch and the same weights (paraphrased from jdehesa and Taras Sereda).
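To make the "chain of matrix multiplications" point concrete, here is a minimal NumPy-only sketch of that single-batch comparison. It does not call TensorFlow or PyTorch. It only mimics their weight layouts for the 784 -> 24 -> 10 MLP from the question, under the assumption that Keras stores Dense kernels as (in, out) and PyTorch stores nn.Linear weights as (out, in). The function names, random weights, and batch are hypothetical stand-ins, not the asker's trained model or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Keras-style layout (assumed): kernels are (in_features, out_features).
keras_weights = [
    rng.standard_normal((784, 24)).astype(np.float32),  # Dense(24) kernel
    rng.standard_normal(24).astype(np.float32),         # Dense(24) bias
    rng.standard_normal((24, 10)).astype(np.float32),   # Dense(10) kernel
    rng.standard_normal(10).astype(np.float32),         # Dense(10) bias
]

# PyTorch-style layout (assumed): weights are (out_features, in_features),
# which is what transposing every array produces (biases are unchanged).
torch_weights = [np.transpose(w) for w in keras_weights]


def forward_keras_layout(x, params):
    """Forward pass computing x @ kernel + bias, as a Dense layer does."""
    k1, b1, k2, b2 = params
    h = np.maximum(x @ k1 + b1, 0.0)  # ReLU
    return h @ k2 + b2                # raw logits


def forward_torch_layout(x, params):
    """Forward pass computing x @ weight.T + bias, as nn.Linear does."""
    w1, b1, w2, b2 = params
    h = np.maximum(x @ w1.T + b1, 0.0)  # ReLU
    return h @ w2.T + b2                # raw logits


# One random batch of 32 fake "images", flattened to (32, 784).
batch = rng.random((32, 28, 28), dtype=np.float32).reshape(32, -1)

logits_a = forward_keras_layout(batch, keras_weights)
logits_b = forward_torch_layout(batch, torch_weights)

# Same computation in float64, to see the size of float32 rounding error.
logits_64 = forward_keras_layout(
    batch.astype(np.float64), [w.astype(np.float64) for w in keras_weights]
)

print("max |a - b|       :", np.max(np.abs(logits_a - logits_b)))
print("max |a - float64| :", np.max(np.abs(logits_a - logits_64)))
print("same predictions  :", np.mean(logits_a.argmax(1) == logits_b.argmax(1)))
```

If the same check on the real models shows logits that differ by much more than rounding error, the likely causes are outside the weights themselves, for example arrays assigned to the wrong parameters or inputs prepared differently for each model. Comparing logits on one batch narrows this down faster than comparing test accuracy.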
