Transferring TensorFlow weights to an equivalent PyTorch model

Question

I have an old U-Net implementation in TensorFlow that was trained on custom data, and I saved its weights in .hdf5 format. I now want to move my code to PyTorch, and I have already implemented an equivalent model there. However, I am having trouble using the saved weights in the new PyTorch model. To convert them, I copy the weights from the TensorFlow model, layer by layer, into a state_dict for my PyTorch model (as shown in the code below) and then load the model from this new dictionary. The resulting PyTorch model does not produce output similar to the TensorFlow model's (the output is garbage).

Is there anything I am missing here? Note that for each layer I had to transpose the weights to match the PyTorch format. I think the problem is there, but I don't know how to fix it. Any guidance on how to approach this problem would also be helpful.

import numpy as np
import tensorflow as tf
import torch


def weight_loading(pretrained_weights):
    # Load the trained Keras model from the .hdf5 file and read its
    # weights as a flat list of NumPy arrays
    tf_model = tf.keras.models.load_model(pretrained_weights)
    tf_weights = tf_model.get_weights()

    # Build the PyTorch model (my own re-implementation of the earlier model)
    pt_model = UNet()
    initial_state_dict = pt_model.state_dict()
    new_state_dict = {}
    with torch.no_grad():
        x = 0
        for i, layer in enumerate(pt_model.modules()):
            if isinstance(layer, torch.nn.Conv2d):
                # extract the weights and biases from the TensorFlow weights
                weight_tf = tf_weights[x*2]
                bias_tf = tf_weights[x*2+1]

                # convert the weights and biases to PyTorch format
                # (.transpose() with no arguments reverses all four axes)
                weight_pt = torch.tensor(weight_tf.transpose())
                bias_pt = torch.tensor(bias_tf)

                # get the name of the weight and bias tensors
                weight_name = list(pt_model.named_parameters())[x*2][0]
                bias_name = list(pt_model.named_parameters())[x*2+1][0]

                # set the weights and biases in the PyTorch model state_dict
                new_state_dict[weight_name]= weight_pt
                new_state_dict[bias_name] = bias_pt

                x = x + 1

            if isinstance(layer, torch.nn.ConvTranspose2d):
                weight_tf = tf_weights[x*2]
                bias_tf = tf_weights[x*2+1]

                # convert the weights and biases to PyTorch format
                # (moves the two channel axes in front of the two kernel axes)
                weight_pt = torch.tensor(np.transpose(weight_tf, (2, 3, 0, 1)))
                bias_pt = torch.tensor(bias_tf)

                # get the name of the weight and bias tensors
                weight_name = list(pt_model.named_parameters())[x*2][0]
                bias_name = list(pt_model.named_parameters())[x*2+1][0]

                # set the weights and biases in the PyTorch model state_dict
                new_state_dict[weight_name] = weight_pt
                new_state_dict[bias_name] = bias_pt

                x = x + 1

    # load the new generated state_dict to pt_model
    pt_model.load_state_dict(new_state_dict)
    return pt_model

In this code, I copy the weights from the TensorFlow model to the PyTorch model layer by layer. Each layer is either a Conv2d or a ConvTranspose2d. I expected that loading the PyTorch model with the converted weights and running it on an image would give output similar to the TensorFlow model's output for the same image. Instead, the two outputs were very different.
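
The transposition step can be checked in isolation with NumPy alone. The following is a minimal sketch, not a verified fix for this particular model. It assumes the Keras layers store kernels in their default layouts, (kH, kW, in, out) for Conv2D and (kH, kW, out, in) for Conv2DTranspose, and that the PyTorch layers expect (out, in, kH, kW) for Conv2d and (in, out, kH, kW) for ConvTranspose2d. The helper names are hypothetical. It shows that calling .transpose() with no arguments yields the right shape for square kernels but swaps the two spatial axes, which may be one source of the mismatch.

```python
import numpy as np


def keras_conv_kernel_to_torch(kernel):
    """Hypothetical helper: Keras Conv2D (kH, kW, in, out) -> PyTorch Conv2d (out, in, kH, kW)."""
    return np.ascontiguousarray(np.transpose(kernel, (3, 2, 0, 1)))


def keras_deconv_kernel_to_torch(kernel):
    """Hypothetical helper: Keras Conv2DTranspose (kH, kW, out, in) -> PyTorch ConvTranspose2d (in, out, kH, kW)."""
    return np.ascontiguousarray(np.transpose(kernel, (3, 2, 0, 1)))


# Toy Conv2D kernel: 3x3 spatial, 2 input channels, 4 output channels
rng = np.random.default_rng(0)
keras_kernel = rng.normal(size=(3, 3, 2, 4)).astype(np.float32)

good = keras_conv_kernel_to_torch(keras_kernel)
bad = keras_kernel.transpose()  # reverses all axes: (out, in, kW, kH)

# For square kernels the shapes agree, so load_state_dict would not complain
same_shape = good.shape == bad.shape
# The values differ, though: each kH x kW filter is spatially transposed
same_values = np.array_equal(good, bad)
spatially_transposed = np.array_equal(bad, np.swapaxes(good, 2, 3))

# Toy Conv2DTranspose kernel: 2x2 spatial, 8 output channels, 16 input channels
deconv_shape = keras_deconv_kernel_to_torch(np.zeros((2, 2, 8, 16), dtype=np.float32)).shape
```

The converted arrays could then be wrapped with torch.from_numpy before being placed in the state_dict. Axis order is only one possible cause: differences in padding behavior between the two frameworks, or a mismatch between the order of pt_model.modules() and the order of tf_model.get_weights(), could also change the output.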

Update: I compared the outputs of the two models after the first max-pooling layer in the U-Net (that is, after two conv layers), and they were slightly different. By comparison, the output of a randomly initialized PyTorch model at the same point was very different.
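
A comparison like the one in the update is easier to interpret when both activations are in the same memory layout and the difference is reduced to a single number. The sketch below assumes the TensorFlow activation is channels-last (N, H, W, C) and the PyTorch activation is channels-first (N, C, H, W), both already converted to NumPy arrays. The function name is hypothetical and the arrays are toy data, not outputs of the real models.

```python
import numpy as np


def max_abs_diff_nhwc_vs_nchw(tf_out, pt_out):
    """Hypothetical helper: largest absolute difference between a
    channels-last array and a channels-first array of the same activation."""
    tf_as_nchw = np.transpose(tf_out, (0, 3, 1, 2))  # (N, H, W, C) -> (N, C, H, W)
    if tf_as_nchw.shape != pt_out.shape:
        raise ValueError(f"shape mismatch: {tf_as_nchw.shape} vs {pt_out.shape}")
    return float(np.max(np.abs(tf_as_nchw - pt_out)))


# Toy activations standing in for the two models' intermediate outputs
rng = np.random.default_rng(0)
tf_activation = rng.normal(size=(1, 4, 5, 3))                 # N, H, W, C
pt_matching = np.transpose(tf_activation, (0, 3, 1, 2)).copy()  # same values, N, C, H, W
pt_shifted = pt_matching + 0.5                                  # deliberately wrong values

diff_matching = max_abs_diff_nhwc_vs_nchw(tf_activation, pt_matching)
diff_shifted = max_abs_diff_nhwc_vs_nchw(tf_activation, pt_shifted)
```

Running this check after each layer in turn, starting from the first convolution, would show where the two models first diverge. With correctly transferred float32 weights the difference is usually expected to be very small rather than exactly zero.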
