Reputation: 1448
Updated: Thanks to the Q&A here, I am able to build a working step function with TensorFlow (see the code below). Now my question evolves into: how can I make use of this tf_stepy activation function, created in TensorFlow, in Keras?
I tried the following code to utilize tf_stepy in Keras, but it does not work:
from tensorflow_step_function import tf_stepy

def buy_hold_sell(x):
    return tf_stepy(x)

get_custom_objects().update({'custom_activation': Activation(buy_hold_sell)})
Below is the step activation function created with TensorFlow:
# tensorflow_step_function.py
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops

def stepy(x):
    if x < 0.33:
        return 0.0
    elif x > 0.66:
        return 1.0
    else:
        return 0.5

np_stepy = np.vectorize(stepy)

def d_stepy(x):
    # "derivative" used for backprop: these are the step values themselves,
    # used as a surrogate gradient (the true derivative is 0 almost everywhere)
    if x < 0.33:
        return 0.0
    elif x > 0.66:
        return 1.0
    else:
        return 0.5

np_d_stepy = np.vectorize(d_stepy)

np_stepy_32 = lambda x: np_stepy(x).astype(np.float32)
np_d_stepy_32 = lambda x: np_d_stepy(x).astype(np.float32)

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for a grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

def tf_d_stepy(x, name=None):
    with ops.op_scope([x], name, "d_stepy") as name:
        y = tf.py_func(np_d_stepy_32,
                       [x],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]

def stepygrad(op, grad):
    x = op.inputs[0]
    n_gr = tf_d_stepy(x)
    return grad * n_gr

def tf_stepy(x, name=None):
    with ops.op_scope([x], name, "stepy") as name:
        y = py_func(np_stepy_32,
                    [x],
                    [tf.float32],
                    name=name,
                    grad=stepygrad)  # <-- here's the call to the gradient
        return y[0]

with tf.Session() as sess:
    x = tf.constant([0.2, 0.7, 0.4, 0.6])
    y = tf_stepy(x)
    tf.initialize_all_variables().run()
    print(x.eval(), y.eval(), tf.gradients(y, [x])[0].eval())
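For reference, the final session block should print something like the following (my own annotation; the gradient is the upstream ones multiplied elementwise by d_stepy(x)):

# x:     [0.2  0.7  0.4  0.6]
# y:     [0.   1.   0.5  0.5]
# dy/dx: [0.   1.   0.5  0.5]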
Original question
I want to write an activation function in Keras based on the idea of a step function, like the graph below.
In NumPy, such a step activation function should behave as follows:
def step_func(x, lower_threshold=0.33, higher_threshold=0.66):
    # x is an array, and an array is returned
    for index in range(len(x)):
        if x[index] < lower_threshold:
            x[index] = 0.0
        elif x[index] > higher_threshold:
            x[index] = 1.0
        else:
            x[index] = 0.5
    return x
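For example, a quick check with the thresholds above (the expected values follow directly from the three branches):

import numpy as np

x = np.array([0.2, 0.7, 0.4, 0.6])
print(step_func(x))  # -> [0.  1.  0.5 0.5]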
I managed to transform the step function from the NumPy version to a keras.tensor version. It works on its own, as below:
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import _to_tensor
import numpy as np

def high_med_low(x, lower_threshold=0.33, higher_threshold=0.66):
    """
    x: tensor
    return a tensor
    """
    # x_shape = K.get_variable_shape(x)
    # x_flat = K.flatten(x)
    x_array = K.get_value(x)
    for index in range(x_array.shape[0]):
        if x_array[index, 0] < lower_threshold:
            x_array[index, 0] = 0.0
        elif x_array[index, 0] > higher_threshold:
            x_array[index, 0] = 1.0
        else:
            x_array[index, 0] = 0.5
    # x_return = x_array.reshape(x_shape)
    return _to_tensor(x_array, x.dtype.base_dtype)

x = K.ones((10, 1)) * 0.7
print(high_med_low(x))

# the following line of code is used in building a model with keras
get_custom_objects().update({'custom_activation': Activation(high_med_low)})
Although this function works on its own, it causes an error when applied to a model. My suspicion is that, as an activation layer, it should not access the individual element values of a tensor.
If so, what is the right way to write this step activation function?
Thanks!
Upvotes: 2
Views: 10409
Reputation: 1
This step function works in TensorFlow because TensorFlow supplies a framework for this in ops: when you call RegisterGradient, TensorFlow uses the user-defined function as the gradient function. However, when you use it in Keras, as you described, you did not add the user-defined gradient function to (let's say) the Keras framework, so it does not work. How do you make it work? Keras uses TensorFlow as its backend, so you can always call functions in keras.backend the same way you call functions in TensorFlow. So, if you can, implement the step function and its gradient function with keras.backend.
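For example, a minimal sketch of the forward pass built purely from keras.backend ops (keras_stepy and its argument names are hypothetical; note that the gradient of this graph is zero almost everywhere, so a gradient override like the one in the question is still needed for training):

import keras.backend as K

def keras_stepy(x, lower=0.33, upper=0.66):
    # hypothetical sketch: 0.0 below `lower`, 0.5 in between, 1.0 above `upper`
    return (0.5 * K.cast(K.greater_equal(x, lower), K.floatx())
            + 0.5 * K.cast(K.greater(x, upper), K.floatx()))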
Upvotes: 0
Reputation: 11543
This will not work. The non-linearities still have to be differentiable; a step function is not differentiable, so the gradients cannot be computed.
You can always try to build a differentiable function that approximates the step. This is already what a sigmoid or a tanh does for a "one-step" version.
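For example, a smooth two-threshold step can be built from two shifted sigmoids (a sketch; smooth_stepy and the sharpness factor k are hypothetical names, with the thresholds taken from the question):

import keras.backend as K

def smooth_stepy(x, k=20.0):
    # ~0.0 below 0.33, ~0.5 in between, ~1.0 above 0.66;
    # larger k gives a sharper step at the cost of vanishing gradients
    return (0.5 * K.sigmoid(k * (x - 0.33))
            + 0.5 * K.sigmoid(k * (x - 0.66)))

Being differentiable everywhere, this can be used directly, e.g. as Activation(smooth_stepy), without any custom gradient.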
I hope this helps a bit :)
Upvotes: 3