Reputation: 1448
Updated: Thanks to the Q&A here, I am able to build a working step function with TensorFlow (see the code below). Now my question evolves into: how can I make use of this tf_stepy activation function, created in TensorFlow, in Keras?
I tried the following code to utilize tf_stepy in Keras, but it does not work:
from tensorflow_step_function import tf_stepy

def buy_hold_sell(x):
    return tf_stepy(x)

get_custom_objects().update({'custom_activation': Activation(buy_hold_sell)})
Below is the step activation function created with TensorFlow:
# tensorflow_step_function.py
import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops

def stepy(x):
    if x < 0.33:
        return 0.0
    elif x > 0.66:
        return 1.0
    else:
        return 0.5

np_stepy = np.vectorize(stepy)

def d_stepy(x):
    # "derivative" used for backprop: these are the step values themselves,
    # used as a surrogate gradient (the true derivative is 0 almost everywhere)
    if x < 0.33:
        return 0.0
    elif x > 0.66:
        return 1.0
    else:
        return 0.5

np_d_stepy = np.vectorize(d_stepy)

np_stepy_32 = lambda x: np_stepy(x).astype(np.float32)
np_d_stepy_32 = lambda x: np_d_stepy(x).astype(np.float32)

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, 1E+8))
    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for a grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

def tf_d_stepy(x, name=None):
    with ops.op_scope([x], name, "d_stepy") as name:
        y = tf.py_func(np_d_stepy_32,
                       [x],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]

def stepygrad(op, grad):
    x = op.inputs[0]
    n_gr = tf_d_stepy(x)
    return grad * n_gr

def tf_stepy(x, name=None):
    with ops.op_scope([x], name, "stepy") as name:
        y = py_func(np_stepy_32,
                    [x],
                    [tf.float32],
                    name=name,
                    grad=stepygrad)  # <-- here's the call to the gradient
        return y[0]

with tf.Session() as sess:
    x = tf.constant([0.2, 0.7, 0.4, 0.6])
    y = tf_stepy(x)
    tf.initialize_all_variables().run()
    print(x.eval(), y.eval(), tf.gradients(y, [x])[0].eval())
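For reference, the final session block should print something like the following (my own annotation; the gradient is the upstream ones multiplied elementwise by d_stepy(x)):

# x:     [0.2  0.7  0.4  0.6]
# y:     [0.   1.   0.5  0.5]
# dy/dx: [0.   1.   0.5  0.5]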
Original question
I want to write an activation function in Keras based on the idea of a step function, like the graph below.
In NumPy, such a step activation function should behave as follows:
def step_func(x, lower_threshold=0.33, higher_threshold=0.66):
    # x is an array, and an array is returned
    for index in range(len(x)):
        if x[index] < lower_threshold:
            x[index] = 0.0
        elif x[index] > higher_threshold:
            x[index] = 1.0
        else:
            x[index] = 0.5
    return x
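For example, a quick check with the thresholds above (the expected values follow directly from the three branches):

import numpy as np

x = np.array([0.2, 0.7, 0.4, 0.6])
print(step_func(x))  # -> [0.  1.  0.5 0.5]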
I managed to transform the step function from the NumPy version to a keras.tensor version. It works on its own, as below:
import tensorflow as tf
import keras.backend as K
from keras.backend.tensorflow_backend import _to_tensor
import numpy as np

def high_med_low(x, lower_threshold=0.33, higher_threshold=0.66):
    """
    x: tensor
    return a tensor
    """
    # x_shape = K.get_variable_shape(x)
    # x_flat = K.flatten(x)
    x_array = K.get_value(x)
    for index in range(x_array.shape[0]):
        if x_array[index, 0] < lower_threshold:
            x_array[index, 0] = 0.0
        elif x_array[index, 0] > higher_threshold:
            x_array[index, 0] = 1.0
        else:
            x_array[index, 0] = 0.5
    # x_return = x_array.reshape(x_shape)
    return _to_tensor(x_array, x.dtype.base_dtype)

x = K.ones((10, 1)) * 0.7
print(high_med_low(x))

# the following line of code is used in building a model with keras
get_custom_objects().update({'custom_activation': Activation(high_med_low)})
Although this function works on its own, it causes an error when applied to a model. My suspicion is that, as an activation layer, it should not access the individual element values of a tensor.
If so, what is the right way to write this step activation function?
Thanks!
Upvotes: 2
Views: 10409
Reputation: 1
This step function works in TensorFlow because TensorFlow supplies a framework for this in ops: when you call RegisterGradient, TensorFlow uses the user-defined function as the gradient function. However, when you use it in Keras, as you described, you did not add the user-defined gradient function to (let's say) the Keras framework, so it does not work. How do you make it work? Keras uses TensorFlow as its backend, so you can always call functions in keras.backend the same way you call functions in TensorFlow. So, if you can, implement the step function and its gradient function with keras.backend.
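For example, a minimal sketch of the forward pass built purely from keras.backend ops (keras_stepy and its argument names are hypothetical; note that the gradient of this graph is zero almost everywhere, so a gradient override like the one in the question is still needed for training):

import keras.backend as K

def keras_stepy(x, lower=0.33, upper=0.66):
    # hypothetical sketch: 0.0 below `lower`, 0.5 in between, 1.0 above `upper`
    return (0.5 * K.cast(K.greater_equal(x, lower), K.floatx())
            + 0.5 * K.cast(K.greater(x, upper), K.floatx()))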
Upvotes: 0
Reputation: 11543
This will not work. The non-linearities still have to be differentiable; a step function is not differentiable, so the gradients cannot be computed.
You can always try to build a differentiable function that approximates the step. This is already what a sigmoid or a tanh does for a "one-step" version.
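For example, a smooth two-threshold step can be built from two shifted sigmoids (a sketch; smooth_stepy and the sharpness factor k are hypothetical names, with the thresholds taken from the question):

import keras.backend as K

def smooth_stepy(x, k=20.0):
    # ~0.0 below 0.33, ~0.5 in between, ~1.0 above 0.66;
    # larger k gives a sharper step at the cost of vanishing gradients
    return (0.5 * K.sigmoid(k * (x - 0.33))
            + 0.5 * K.sigmoid(k * (x - 0.66)))

Being differentiable everywhere, this can be used directly, e.g. as Activation(smooth_stepy), without any custom gradient.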
I hope this helps a bit :)
Upvotes: 3