tf.keras is a well-integrated version of Keras that takes advantage of most TensorFlow efficiencies and optimizations, including distributed training and model deployment, while still abstracting away low-level coding details. As of TensorFlow 2.0, all creation and use of layers and models goes through tf.keras. The rest of the TensorFlow framework covers both lower-level details, such as certain data-processing methods, and other tools, such as uncertainty estimation and model deployment.
TensorFlow allows the creation of custom layers, which then work similarly to built-in ones like Dense and Conv2D. To do this, one should extend the keras.layers.Layer class, ideally defining the __init__ (initialization), build (weight creation) and call (forward pass) methods. Here's an example:
```python
import tensorflow as tf
from tensorflow import keras

class Linear(keras.layers.Layer):
    def __init__(self, units=32):
        super(Linear, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="random_normal", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
```
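As a quick usage sketch (shapes chosen arbitrarily): the layer builds its weights lazily on the first call, inferring the input dimension from the data it receives.

```python
import tensorflow as tf
from tensorflow import keras

class Linear(keras.layers.Layer):
    """Same layer as above, repeated so this snippet runs standalone."""
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="random_normal", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

layer = Linear(units=4)
y = layer(tf.ones((2, 3)))   # build() runs here, inferring input dim 3
print(y.shape)               # (2, 4)
print(len(layer.weights))    # 2 weights: w and b
```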
Additionally, one can define a custom model. The advantage of doing so is being able to override the training (fit), evaluation (evaluate), inference (predict) and saving (save and save_weights) methods. This is done by extending the tf.keras.Model class. Example:
```python
import tensorflow as tf
from tensorflow.keras import layers

class ResNet(tf.keras.Model):
    def __init__(self, num_classes=10):
        super(ResNet, self).__init__()
        self.block_1 = ResNetBlock()  # assumed to be defined elsewhere
        self.block_2 = ResNetBlock()
        self.global_pool = layers.GlobalAveragePooling2D()
        self.classifier = layers.Dense(num_classes)

    def call(self, inputs):
        x = self.block_1(inputs)
        x = self.block_2(x)
        x = self.global_pool(x)
        return self.classifier(x)

resnet = ResNet()
dataset = ...
# the model must be compiled before fit() can be called
resnet.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
resnet.fit(dataset, epochs=10)
resnet.save(filepath)
```
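ResNetBlock is not defined in the snippet above; a minimal hypothetical sketch of such a block (the filter counts and layout here are assumptions, not the original code) could look like:

```python
import tensorflow as tf
from tensorflow.keras import layers

class ResNetBlock(tf.keras.layers.Layer):
    """A minimal residual block: two conv layers plus a skip connection."""
    def __init__(self, filters=64, kernel_size=3):
        super().__init__()
        self.filters = filters
        self.conv1 = layers.Conv2D(filters, kernel_size, padding="same",
                                   activation="relu")
        self.conv2 = layers.Conv2D(filters, kernel_size, padding="same")
        self.proj = None

    def build(self, input_shape):
        # Project the shortcut if the input channel count differs from `filters`.
        if input_shape[-1] != self.filters:
            self.proj = layers.Conv2D(self.filters, 1, padding="same")

    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.conv2(x)
        shortcut = self.proj(inputs) if self.proj is not None else inputs
        return tf.nn.relu(x + shortcut)

block = ResNetBlock(filters=8)
out = block(tf.ones((1, 16, 16, 3)))
print(out.shape)  # (1, 16, 16, 8)
```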
So if you're wondering, "should I use the Layer class or the Model class?", ask yourself: will I need to call fit() on it? Will I need to call save() on it? If so, go with Model. If not (either because your class is just a block in a bigger system, or because you are writing training & saving code yourself), use Layer.
To be confirmed: custom layers might not work directly with TensorFlow's usual model-construction APIs, such as the Sequential model. When custom layers are involved, we might be required to define a custom model as well, even if we don't need to change fit or the other methods (we can override just the __init__ and call methods).
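On that point: in recent TensorFlow versions a subclassed layer does appear to plug directly into keras.Sequential, like any built-in layer. A quick check, repeating the Linear layer from earlier so the snippet stands alone:

```python
import tensorflow as tf
from tensorflow import keras

class Linear(keras.layers.Layer):
    """The custom layer from earlier, repeated so this snippet runs standalone."""
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="random_normal", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# Custom layers mixed with built-in ones inside a Sequential model.
model = keras.Sequential([
    Linear(units=16),
    keras.layers.Activation("relu"),
    Linear(units=1),
])
out = model(tf.ones((4, 8)))
print(out.shape)  # (4, 1)
```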
Making new Layers and Models via subclassing | TensorFlow Core
There might be times when it's useful to define a custom training pipeline. In that case, a couple of nuances should be taken into account:
Ideally, one should define two functions with the @tf.function decorator, one corresponding to a single training step and another to a single testing step. The @tf.function decorator is useful to improve performance and to allow the model to be exported for saving. But beware that debugging is easier without it:
Only use tf.function to decorate high-level computations - for example, one step of training or the forward pass of your model.
The expression with tf.GradientTape() as tape: is needed during training to record operations so that gradients can be computed. It is roughly the opposite of PyTorch's with torch.no_grad(): PyTorch tracks gradients by default and no_grad() disables that, whereas TensorFlow only records operations inside a GradientTape context.
In order to run the optimization step, we call tape.gradient to compute the gradients and then the optimizer's apply_gradients method to update the weights.
The training boolean parameter, passed when calling the model, switches the behavior of components that act differently during training and inference. For example, dropout should be active during training, via model(input, training=True), and shut off during testing, via model(input, training=False). This is similar to PyTorch's model.train() and model.eval() modes.
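A small illustration with a Dropout layer: with training=True roughly half the units are zeroed (and the survivors rescaled by 1/(1-rate)), while training=False leaves the input untouched.

```python
import tensorflow as tf
from tensorflow import keras

drop = keras.layers.Dropout(rate=0.5)
x = tf.ones((1, 10))

train_out = drop(x, training=True)   # each value is either 0.0 or 2.0 (= 1/(1-0.5))
test_out = drop(x, training=False)   # identity: dropout is disabled at inference
print(test_out.numpy())
```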
When starting a new epoch, the loss and the remaining metrics should be reset through reset_states (renamed reset_state in recent Keras versions); otherwise, their values keep accumulating across epochs.
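A sketch of that reset between epochs, using a Keras accuracy metric (the predictions and labels here are made up for illustration):

```python
import tensorflow as tf

acc = tf.keras.metrics.SparseCategoricalAccuracy()

# "Epoch 1": both predictions (argmax 0 and 1) match the labels.
acc.update_state([0, 1], [[0.9, 0.1], [0.2, 0.8]])
print(float(acc.result()))   # 1.0

# Reset before "epoch 2"; otherwise the new batches would be
# averaged together with epoch 1's.
acc.reset_state()            # named reset_states in older tf.keras versions
acc.update_state([1, 1], [[0.9, 0.1], [0.2, 0.8]])
print(float(acc.result()))   # 0.5: only the second prediction is correct
```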