Jack Robson

Reputation: 31

Convert Keras model to PyTorch

Is there an easy way to convert a model like this from Keras to PyTorch?

I have the following code in Keras:

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l2

state_dim = 10
architecture = (256, 256)  # units per layer
learning_rate = 0.0001  # learning rate
l2_reg = 0.00000001  # L2 regularization
trainable = True
num_actions = 3

layers = []
n = len(architecture)  # n = 2 (unused below)

for i, units in enumerate(architecture, 1):
    layers.append(Dense(units=units,
                        input_dim=state_dim if i == 1 else None,
                        activation='relu',
                        kernel_regularizer=l2(l2_reg),
                        name=f'Dense_{i}',
                        trainable=trainable))
    
layers.append(Dropout(.1))
layers.append(Dense(units=num_actions,
                    trainable=trainable,
                    name='Output'))

model = Sequential(layers)
model.compile(loss='mean_squared_error',
              optimizer=Adam(learning_rate=learning_rate))

This outputs the following summary:

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
Dense_1 (Dense)              (None, 256)               2816      
_________________________________________________________________
Dense_2 (Dense)              (None, 256)               65792     
_________________________________________________________________
dropout_3 (Dropout)          (None, 256)               0         
_________________________________________________________________
Output (Dense)               (None, 3)                 771       
=================================================================
Total params: 69,379
Trainable params: 69,379
Non-trainable params: 0
_________________________________________________________________
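
(As a sanity check on those numbers: Dense_1 has (10 + 1) × 256 = 2,816 parameters, Dense_2 has (256 + 1) × 256 = 65,792, and Output has (256 + 1) × 3 = 771, so an equivalent PyTorch model should report the same 69,379 total.)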

I must admit I'm a little out of my depth, so any advice is appreciated. I'm trying to read through the PyTorch docs and will update my question with a possible answer if I manage.

Upvotes: 0

Views: 1251

Answers (1)

Jack Robson

Reputation: 31

Here is my best attempt:

state_dim = 10
architecture = (256, 256)  # units per layer
learning_rate = 0.0001  # learning rate
l2_reg = 0.00000001  # L2 regularization
trainable = True
num_actions = 3

import torch
from torch import nn

class CustomModel(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.layers = nn.Sequential(
            nn.Linear(state_dim, architecture[0]),
            nn.ReLU(),
            nn.Linear(architecture[0], architecture[1]),
            nn.ReLU(),
            nn.Dropout(0.1),  # match the Keras model's Dropout(.1)
            nn.Linear(architecture[1], num_actions),
        )

    def forward(self, x):
        return self.layers(x)

model = CustomModel()
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
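
As a quick sanity check, the total parameter count matches the Keras summary:

total_params = sum(p.numel() for p in model.parameters())
print(total_params)  # 69379, same as "Total params" above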

Printing the model gives promising-looking output:

CustomModel(
    (layers): Sequential(
        (0): Linear(in_features=10, out_features=256, bias=True)
        (1): ReLU()
        (2): Linear(in_features=256, out_features=256, bias=True)
        (3): ReLU()
        (4): Dropout(p=0.1, inplace=False)
        (5): Linear(in_features=256, out_features=3, bias=True)
    )
)
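
For completeness, here is my rough understanding of what replaces Keras's compile()/fit(): plain PyTorch has no fit(), so the loss and optimizer get wired together by hand in a training step. The states and targets below are made-up placeholders, just to show the shapes:

states = torch.randn(32, state_dim)       # batch of 32 example states
targets = torch.randn(32, num_actions)    # matching Q-value targets

model.train()                             # enable Dropout during training
optimizer.zero_grad()                     # clear gradients from the last step
loss = criterion(model(states), targets)  # MSE, as in the Keras compile()
loss.backward()                           # backpropagate
optimizer.step()                          # Adam update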

However, a few items are still unanswered:

  1. Are the activations in the right place?
  2. How do we add the equivalent of kernel_regularizer=l2(l2_reg) to the first two Linear/Dense layers? (Tentative sketch below.)
  3. And how do we control whether the layers are trainable? (Also sketched below.)
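
On (1), I believe the placement does match: Keras's activation='relu' on a Dense layer applies ReLU after the affine transform, which is exactly Linear followed by nn.ReLU(). For (2) and (3), here is a tentative, unverified sketch of how I'd build the optimizer in place of the plain Adam above. PyTorch has no per-layer kernel_regularizer; the closest built-in is the optimizer's weight_decay, which adds an L2 term per parameter group, and trainability is controlled through each parameter's requires_grad flag:

# (2) Keras's l2(l2_reg) adds l2_reg * sum(w ** 2) to the loss (gradient
# 2 * l2_reg * w), while weight_decay adds weight_decay * w to the gradient,
# so weight_decay = 2 * l2_reg should be the matching value. Parameter groups
# restrict the penalty to the first two Linear layers' weights, mirroring
# kernel_regularizer (which ignores biases):
decay_params = [model.layers[0].weight, model.layers[2].weight]
other_params = [p for p in model.parameters()
                if not any(p is d for d in decay_params)]
optimizer = torch.optim.Adam(
    [{'params': decay_params, 'weight_decay': 2 * l2_reg},
     {'params': other_params, 'weight_decay': 0.0}],
    lr=learning_rate)

# (3) Keras's trainable flag corresponds to requires_grad; setting it to
# False freezes a parameter so the optimizer never updates it:
for param in model.parameters():
    param.requires_grad = trainable  # trainable = True here, so nothing is frozen

An alternative for (2) that matches Keras exactly would be to add l2_reg * (w ** 2).sum() for each of those two weight tensors directly to the loss in the training step.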

Any input appreciated.

Upvotes: 1
