Hirek

Reputation: 473

Are those Keras and PyTorch snippets equivalent?

I am wondering whether I have correctly translated the following PyTorch definition to Keras.

In PyTorch, the following multi-layer perceptron was defined:

from torch import nn
hidden = 128
def mlp(size_in, size_out, act=nn.ReLU):
    return nn.Sequential(
        nn.Linear(size_in, hidden),
        act(),
        nn.Linear(hidden, hidden),
        act(),
        nn.Linear(hidden, hidden),
        act(),
        nn.Linear(hidden, size_out),
    )

My translation is

from tensorflow import keras
from tensorflow.keras import layers

hidden = 128

def mlp(size_in, size_out, act=keras.layers.ReLU):
    return keras.Sequential(
        [
            layers.Dense(hidden, activation=None, name="layer1", input_shape=(size_in, 1)),
            act(),
            layers.Dense(hidden, activation=None, name="layer2", input_shape=(hidden, 1)),
            act(),
            layers.Dense(hidden, activation=None, name="layer3", input_shape=(hidden, 1)),
            act(),
            layers.Dense(size_out, activation=None, name="layer4", input_shape=(hidden, 1))
        ])

I am particularly confused about the input/output shape arguments, because that seems to be where TensorFlow and PyTorch differ.

From the documentation:

When a popular kwarg input_shape is passed, then keras will create an input layer to insert before the current layer. This can be treated equivalent to explicitly defining an InputLayer.
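If I read that correctly, passing input_shape should build the same model as putting an explicit Input layer in front. Here is a minimal check of the equivalence I have in mind (the sizes are arbitrary):

from tensorflow import keras
from tensorflow.keras import layers

# Both should report output shape (None, 4) and 36 parameters.
a = keras.Sequential([layers.Dense(4, input_shape=(8,))])
b = keras.Sequential([layers.Input(shape=(8,)), layers.Dense(4)])
a.summary()
b.summary()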

So, did I get it right?

Upvotes: 0

Views: 114

Answers (1)

Ivan

Reputation: 40618

In Keras, you can provide an input_shape on the first layer, or alternatively use a tf.keras.layers.Input layer. If you do neither, the model gets built the first time you call fit, evaluate, or predict, or the first time you call the model on some input data, so the input shape is inferred from the data you pass in. See the docs for more details. PyTorch, in contrast, never declares a model-level input shape; shapes are resolved at runtime from the tensors you feed it (each nn.Linear only fixes its own in/out feature counts).

Note that your input_shape=(size_in, 1) declares a 3-D input of shape (batch, size_in, 1), so each Dense would act on the trailing axis of length 1, which is not what the PyTorch model does. For a flat feature vector the shape should be (size_in,), and it only needs to be given once, on the first layer.
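A minimal sketch of the deferred build (layer width and input size here are arbitrary):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([layers.Dense(8)])
print(model.built)                        # False: input shape unknown so far
model(np.zeros((2, 4), dtype="float32"))  # first call triggers the build
print(model.built)                        # True: kernel now has shape (4, 8)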

from tensorflow import keras
from tensorflow.keras import layers
from torch import nn

hidden = 128

def keras_mlp(size_in, size_out, act=layers.ReLU):
    return keras.Sequential([layers.Input(shape=(size_in,)),
                             layers.Dense(hidden, name='layer1'),
                             act(),
                             layers.Dense(hidden, name='layer2'),
                             act(),
                             layers.Dense(hidden, name='layer3'),
                             act(),
                             layers.Dense(size_out, name='layer4')])

def pytorch_mlp(size_in, size_out, act=nn.ReLU):
    return nn.Sequential(nn.Linear(size_in, hidden),
                         act(),
                         nn.Linear(hidden, hidden),
                         act(),
                         nn.Linear(hidden, hidden),
                         act(),
                         nn.Linear(hidden, size_out))

You can compare their summaries.

  • For Keras:

    >>> keras_mlp(10, 5).summary()
    Model: "sequential_2"
    _________________________________________________________________
     Layer (type)                Output Shape              Param #   
    =================================================================
     layer1 (Dense)              (None, 128)               1408      
    
     re_lu_6 (ReLU)              (None, 128)               0         
    
     layer2 (Dense)              (None, 128)               16512     
    
     re_lu_7 (ReLU)              (None, 128)               0         
    
     layer3 (Dense)              (None, 128)               16512     
    
     re_lu_8 (ReLU)              (None, 128)               0         
    
     layer4 (Dense)              (None, 5)                 645       
    
    =================================================================
    Total params: 35,077
    Trainable params: 35,077
    Non-trainable params: 0
    _________________________________________________________________
    
  • For PyTorch:

    >>> from torchinfo import summary
    >>> summary(pytorch_mlp(10, 5), (1, 10))
    ============================================================================
    Layer (type:depth-idx)                   Output Shape              Param #
    ============================================================================
    Sequential                               [1, 5]                    --
    ├─Linear: 1-1                            [1, 128]                  1,408
    ├─ReLU: 1-2                              [1, 128]                  --
    ├─Linear: 1-3                            [1, 128]                  16,512
    ├─ReLU: 1-4                              [1, 128]                  --
    ├─Linear: 1-5                            [1, 128]                  16,512
    ├─ReLU: 1-6                              [1, 128]                  --
    ├─Linear: 1-7                            [1, 5]                    645
    ============================================================================
    Total params: 35,077
    Trainable params: 35,077
    Non-trainable params: 0
    Total mult-adds (M): 0.04
    ============================================================================
    Input size (MB): 0.00
    Forward/backward pass size (MB): 0.00
    Params size (MB): 0.14
    Estimated Total Size (MB): 0.14
    ============================================================================
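Matching parameter counts is a good sign but not a proof of equivalence. For a stronger check, you can copy the PyTorch weights into the Keras model and compare outputs on the same input. A sketch of that check, using the two functions above (note the transpose: nn.Linear stores its weight as (out_features, in_features), while Dense keeps its kernel as (in, out)):

import numpy as np
import torch

k_model = keras_mlp(10, 5)
p_model = pytorch_mlp(10, 5)

# Pair up the trainable layers and copy weights over, transposing each kernel.
linears = [m for m in p_model if isinstance(m, nn.Linear)]
denses = [l for l in k_model.layers if isinstance(l, layers.Dense)]
for lin, dense in zip(linears, denses):
    dense.set_weights([lin.weight.detach().numpy().T,
                       lin.bias.detach().numpy()])

x = np.random.randn(3, 10).astype("float32")
out_keras = k_model(x).numpy()
out_torch = p_model(torch.from_numpy(x)).detach().numpy()
print(np.allclose(out_keras, out_torch, atol=1e-5))  # expect: True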
    

Upvotes: 2
