Reputation: 2478
I was reading the TensorFlow 2.0 Tutorial and I came across model subclassing to create TensorFlow 2.0 models.
The code I found was:
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, Dense, Flatten

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.d2 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

# Create an instance of the model
model = MyModel()
Now, in this code, my confusion is that the author never defines the inputs. There is no:
self.input_layer = Input(
    shape=(28, 28)
)
# OR
self.conv1 = Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1))
How does the defined model then know how many attributes/features to expect from the training data?
Thanks
Upvotes: 2
Views: 1657
Reputation: 329
In tensorflow/keras, a Model is a succession of layers that can itself serve as a sub-block for more complicated networks. The basic notion there is recursive composition.
For most machine learning libraries, though, a model is an algorithm that takes an input, processes it, and generates an output (well, it's an algorithm!) that is supposed to be related to the input via decision theory. As such it can be trained and tested on a given dataset, or at least against dataset prescriptions (the size and type of the data, for instance). The basic notion there is a static piece of code.
So the two visions (recursive construction versus static construction) are somewhat incompatible, but a workaround may be found. The way I did it is summarized in the example below:
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Dropout, Input

class MultiLayerPerceptron(Model):
    def __init__(
            self,
            d_input: int = 32,
            d_hidden: int = 64,
            d_output: int = 16,
            dropout: float = 0.0):
        # Plain (non-layer) attributes may be set before super().__init__()
        self.d_input = int(d_input)
        self.d_hidden = int(d_hidden)
        self.d_output = int(d_output)
        self.dropout = float(dropout)
        # Build the graph of layers in the functional style ...
        inputs = Input(shape=(self.d_input,), name='input')
        x = Dense(self.d_hidden, activation='relu', name='hidden')(inputs)
        x = Dropout(self.dropout, name='dropout')(x)
        outputs = Dense(self.d_output, activation='sigmoid', name='output')(x)
        # ... and only then hand it over to Model.__init__
        super().__init__(inputs=inputs, outputs=outputs)
        loss_function = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        optimizer = tf.keras.optimizers.Adam()
        super().compile(
            optimizer=optimizer, loss=loss_function, metrics=['accuracy'])

    def predict(self, x):
        # Threshold the sigmoid outputs into hard 0/1 labels
        y = super().predict(x)
        return tf.round(y).numpy().astype(int)

    def transform(self, x):
        return self(x)
which --I hope-- is sufficiently clear. I simply postponed the call to super().__init__ to the point where the network has been constructed. One then gets a striking resemblance with scikit-learn/machine-learning terminology.
Nevertheless, note that:
- the class still inherits the Model methods, such as fit, evaluate, predict, ... but not save, which is another story;
- in any case, for simple models it is simpler (in keras/recurrent terminology) to use the functional API;
- if one needs to customize the layers, then just construct custom layers, not a custom Model; such layers can still be assembled at the functional API level.
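For completeness, here is a minimal usage sketch of the class above (the toy data and its shapes are made up for illustration):

import numpy as np

# Hypothetical toy data: 100 samples, 32 features, 16 binary targets
x_train = np.random.rand(100, 32).astype('float32')
y_train = np.random.randint(0, 2, size=(100, 16)).astype('float32')

model = MultiLayerPerceptron(d_input=32, d_hidden=64, d_output=16)
model.fit(x_train, y_train, epochs=2, batch_size=16)  # compiled in __init__
labels = model.predict(x_train)  # hard 0/1 labels of shape (100, 16)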
Upvotes: 0
Reputation: 14983
According to Francois Chollet, the answer to your question is the following (comparing the Functional and Sequential APIs with the subclassing Model API):
You can do all these things (printing input / output shapes) in a Functional or Sequential model because these models are static graphs of layers.
In contrast, a subclassed model is a piece of Python code (a call method). There is no graph of layers here. We cannot know how layers are connected to each other (because that's defined in the body of call, not as an explicit data structure), so we cannot infer input / output shapes.
A much more detailed explanation of these 3 types is available here: https://medium.com/tensorflow/what-are-symbolic-and-imperative-apis-in-tensorflow-2-0-dfccecb01021
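In other words, a subclassed model only discovers its input shape the first time it is called on concrete data. A minimal sketch of that behaviour, reusing MyModel from the question (the dummy batch shape assumes MNIST-like 28x28 grayscale images):

import tensorflow as tf

model = MyModel()  # the subclassed model from the question

# Before the first call the model has no weights, and summary() would fail.
# Calling it once on a batch builds every layer against that input shape.
_ = model(tf.zeros((1, 28, 28, 1)))

model.summary()  # shapes and parameter counts are now known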
An example of how you can still achieve this, by mixing the functional and model subclassing APIs, is below (credits to ixez on GitHub):
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, inputs, **kwargs):
        return self.dense(inputs)

    def model(self):
        # Wrap the subclassed model in a static functional Model so
        # that input/output shapes become inspectable
        x = Input(shape=(1,))
        return Model(inputs=[x], outputs=self.call(x))

MyModel().model().summary()
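Alternatively (my own addition, not part of ixez's snippet), you can build() the subclassed model with an explicit input shape before calling summary(), though the reported output shapes can be less informative than with the functional wrapper above:

model = MyModel()
model.build(input_shape=(None, 1))  # None stands for the batch dimension
model.summary()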
Upvotes: 5