Hardian Lawi

Changing activation function of a keras layer w/o replacing whole layer

I am trying to change the activation function of the last layer of a Keras model without replacing the whole layer. In this case, I only want to swap out the softmax function.

import keras.backend as K
from keras.models import load_model
from keras.preprocessing.image import load_img, img_to_array
import numpy as np

model = load_model(model_path)  # Load any model
img = load_img(img_path, target_size=(224, 224))
img = img_to_array(img)
img = np.expand_dims(img, axis=0)  # add a batch dimension before predicting
print(model.predict(img))

My output:

array([[1.53172877e-07, 7.13159451e-08, 6.18941920e-09, 8.52070968e-07,
    1.25813088e-07, 9.98970985e-01, 1.48254022e-08, 6.09538893e-06,
    1.16236095e-07, 3.91888688e-10, 6.29304608e-08, 1.79565995e-09,
    1.75571788e-08, 1.02110009e-03, 2.14380114e-09, 9.54465733e-08,
    1.05938483e-07, 2.20544337e-07]], dtype=float32)

Then I do this to change the activation:

model.layers[-1].activation = custom_softmax
print(model.predict(img))

and the output I get is exactly the same. Any ideas on how to fix this? Thanks!

To reproduce this, you can use the custom_softmax below:

def custom_softmax(x, axis=-1):
    """Softmax activation function.
    # Arguments
        x: Tensor.
        axis: Integer, axis along which the softmax normalization is applied.
    # Returns
        Tensor, output of softmax transformation.
    # Raises
        ValueError: In case `dim(x) == 1`.
    """
    ndim = K.ndim(x)
    if ndim >= 2:
        return K.zeros_like(x)
    else:
        raise ValueError('Cannot apply softmax to a tensor that is 1D')

Upvotes: 9

Answers (1)

Julio Cezar Silva

As things currently stand, there is no official, clean way to do that. As @layser pointed out in the comments, simply reassigning the attribute does not update the underlying TensorFlow graph, which is why your output stays the same. One option is to use keras-vis' utils. My recommendation is to isolate that in your own utils.py, like so:

from vis.utils.utils import apply_modifications

def update_layer_activation(model, activation, index=-1):
    model.layers[index].activation = activation
    # Rebuilding the model is what makes the new activation take effect
    return apply_modifications(model)

Which you would then use like this:

model = update_layer_activation(model, custom_softmax)

If you follow the given link, you'll see that what they do is quite simple: the model is saved to a temporary path, loaded back, and returned, and the temp file is deleted.
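
For reference, here is a minimal sketch of that save/reload round trip (the helper name rebuild_model is illustrative, not the keras-vis source; you would also need to pass custom_objects if your model uses custom functions such as custom_softmax):

import os
import tempfile
from keras.models import load_model

def rebuild_model(model, custom_objects=None):
    # Save the modified model to a temporary file and load it back,
    # so the underlying TensorFlow graph is rebuilt with the new activation.
    tmp_path = os.path.join(tempfile.gettempdir(), 'rebuilt_model.h5')
    try:
        model.save(tmp_path)
        return load_model(tmp_path, custom_objects=custom_objects)
    finally:
        os.remove(tmp_path)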

Upvotes: 5
