Reputation: 2478
I was reading the TensorFlow 2.0 Tutorial and I came across model subclassing to create TensorFlow 2.0 models.
The code I found was:
from tensorflow.keras import Model
from tensorflow.keras.layers import Conv2D, Dense, Flatten

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.d2 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

# Create an instance of the model
model = MyModel()
Now, in this code, my confusion is that the author never defines the inputs. There is no:
self.input_layer = Input(
    shape=(28, 28)
)
# OR
self.conv1 = Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1))
How does the defined model then know how many attributes/features to expect from the training data?
Thanks
Upvotes: 2
Views: 1657
Reputation: 329
In tensorflow/keras, a Model is a succession of layers that can itself serve as a sub-block for more complicated networks. The basic notion there is recursive composition.
For most machine learning libraries, though, a model is an algorithm that takes an input, processes it, and generates an output (well, it's an algorithm!) that is supposed to be related to the input via decision theory. As such it can be trained and tested on a given dataset, or at least against dataset prescriptions (the size and type of the data, for instance). The basic notion there is a static piece of code.
So the two visions (recursive construction versus static construction) are somewhat incompatible, but a workaround may be found. The way I did it is summarized in the example below:
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Dense, Dropout, Input

class MultiLayerPerceptron(Model):
    def __init__(
            self,
            d_input: int = 32,
            d_hidden: int = 64,
            d_output: int = 16,
            dropout: float = 0.0):
        # Plain (non-layer) attributes may be set before super().__init__()
        self.d_input = int(d_input)
        self.d_hidden = int(d_hidden)
        self.d_output = int(d_output)
        self.dropout = float(dropout)
        # Build the graph of layers in the functional style ...
        inputs = Input(shape=(self.d_input,), name='input')
        x = Dense(self.d_hidden, activation='relu', name='hidden')(inputs)
        x = Dropout(self.dropout, name='dropout')(x)
        outputs = Dense(self.d_output, activation='sigmoid', name='output')(x)
        # ... and only then hand it over to Model.__init__
        super().__init__(inputs=inputs, outputs=outputs)
        loss_function = tf.keras.losses.BinaryCrossentropy(from_logits=False)
        optimizer = tf.keras.optimizers.Adam()
        super().compile(
            optimizer=optimizer, loss=loss_function, metrics=['accuracy'])

    def predict(self, x):
        # Threshold the sigmoid outputs into hard 0/1 labels
        y = super().predict(x)
        return tf.round(y).numpy().astype(int)

    def transform(self, x):
        return self(x)
which --I hope-- is sufficiently clear. I simply postponed the call to super().__init__ to the point where the network has been constructed. One then gets a striking resemblance with scikit-learn/machine-learning terminology.
Nevertheless, note that:
- the class still inherits the Model methods, such as fit, evaluate, predict, ... but not save, which is another story;
- in any case, for simple models it is simpler (in keras/recurrent terminology) to use the functional API;
- if one needs to customize the layers, then just construct custom layers, not a custom Model; such layers can still be assembled at the functional API level.
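For completeness, here is a minimal usage sketch of the class above (the toy data and its shapes are made up for illustration):

import numpy as np

# Hypothetical toy data: 100 samples, 32 features, 16 binary targets
x_train = np.random.rand(100, 32).astype('float32')
y_train = np.random.randint(0, 2, size=(100, 16)).astype('float32')

model = MultiLayerPerceptron(d_input=32, d_hidden=64, d_output=16)
model.fit(x_train, y_train, epochs=2, batch_size=16)  # compiled in __init__
labels = model.predict(x_train)  # hard 0/1 labels of shape (100, 16)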
Upvotes: 0
Reputation: 14983
According to Francois Chollet, the answer to your question is the following (comparing the Functional and Sequential APIs with the subclassing Model API):
You can do all these things (printing input / output shapes) in a Functional or Sequential model because these models are static graphs of layers.
In contrast, a subclassed model is a piece of Python code (a call method). There is no graph of layers here. We cannot know how layers are connected to each other (because that's defined in the body of call, not as an explicit data structure), so we cannot infer input / output shapes.
A much more detailed explanation of these 3 types is available here: https://medium.com/tensorflow/what-are-symbolic-and-imperative-apis-in-tensorflow-2-0-dfccecb01021
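In other words, a subclassed model only discovers its input shape the first time it is called on concrete data. A minimal sketch of that behaviour, reusing MyModel from the question (the dummy batch shape assumes MNIST-like 28x28 grayscale images):

import tensorflow as tf

model = MyModel()  # the subclassed model from the question

# Before the first call the model has no weights, and summary() would fail.
# Calling it once on a batch builds every layer against that input shape.
_ = model(tf.zeros((1, 28, 28, 1)))

model.summary()  # shapes and parameter counts are now known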
An example of how you can still achieve this, by mixing the functional and model subclassing APIs, is below (credits to ixez on GitHub):
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.layers import Input

class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, inputs, **kwargs):
        return self.dense(inputs)

    def model(self):
        # Wrap the subclassed model in a static functional Model so
        # that input/output shapes become inspectable
        x = Input(shape=(1,))
        return Model(inputs=[x], outputs=self.call(x))

MyModel().model().summary()
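Alternatively (my own addition, not part of ixez's snippet), you can build() the subclassed model with an explicit input shape before calling summary(), though the reported output shapes can be less informative than with the functional wrapper above:

model = MyModel()
model.build(input_shape=(None, 1))  # None stands for the batch dimension
model.summary()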
Upvotes: 5