Reputation: 3281
I use the simple Iris dataset, which has 4 features, and I want to apply some preprocessing steps before the data enters the network. For example, I want my NN to receive only 3 features, each one the average of two consecutive original features.
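import numpy as np
from copy import deepcopy

# x has shape (120 samples, 4 features)
tmp = np.zeros((x.shape[0], x.shape[1] - 1))
for i in range(x.shape[1] - 1):
    # average each pair of consecutive features
    tmp[:, i] = (x[:, i] + x[:, i + 1]) / 2.
x = deepcopy(tmp)  # after preprocessing, x has shape (120, 3)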
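The same averaging can also be written as a single vectorized expression instead of a loop. This is a minimal sketch, assuming x is a 2-D NumPy array; the function name average_consecutive and the random stand-in data are illustrative and not part of my original code:

```python
import numpy as np

def average_consecutive(x):
    """Average each pair of adjacent columns: (N, F) -> (N, F - 1)."""
    x = np.asarray(x, dtype=np.float64)
    return (x[:, :-1] + x[:, 1:]) / 2.0

# Stand-in for the 120 x 4 Iris training matrix (random values, used for shape only).
x = np.random.RandomState(0).rand(120, 4)
x_avg = average_consecutive(x)
print(x_avg.shape)  # (120, 3)
```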
I've tried adding those steps to input_function and changing the shape in the feature_columns definition to 3:
def input_function(x, y, is_train):
    # average each pair of consecutive features: (N, 4) -> (N, 3)
    tmp = np.zeros((x.shape[0], x.shape[1] - 1))
    for i in range(x.shape[1] - 1):
        tmp[:, i] = (x[:, i] + x[:, i + 1]) / 2.
    x = deepcopy(tmp)
    # the dict key must match the key used in feature_columns
    dict_x = {"featurename": x}
    dataset = tf.data.Dataset.from_tensor_slices((dict_x, y))
    if is_train:
        dataset = dataset.shuffle(num_train).repeat(num_epoch).batch(num_train)
    else:
        dataset = dataset.batch(num_test)
    return dataset
The way I train the classifier:
feature_columns = [
    tf.feature_column.numeric_column(key="featurename", shape=3),
]
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[50, 20],
    n_classes=3,
    optimizer=tf.train.GradientDescentOptimizer(0.001),
    activation_fn=tf.nn.relu,
    model_dir='modeliris2/'
)
classifier.train(
    input_fn=lambda: input_function(xtrain, ytrain, True)
)
My serving input function:
def my_serving_input_fn2():
    input_data = {
        "featurename": tf.placeholder(tf.float32, [None, 3], name='inputtensors')
    }
    return tf.estimator.export.ServingInputReceiver(input_data, input_data)
It works when I run it, but if I freeze the model and then use it to predict, it doesn't work. The error is:
ValueError: Cannot feed value of shape (1, 4) for Tensor 'import/inputtensors:0', which has shape '(?, 3)'
If I change the shape in my_serving_input_fn2 to [None, 4], I still get an error after freezing the model:
InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 4 values, but the requested shape has 3
My question: if I need to include preprocessing or feature engineering steps (like MFCC in signal preprocessing) in my model, where should I put them? Is my approach correct? Why did the error occur? Is there a better solution?
An additional question: if my preprocessing steps need external files (like a stopword list in text preprocessing), is it still possible to include those files for preprocessing when using TF Lite?
Upvotes: 1
Views: 898
Reputation: 134
Technically, you can put the preprocessing step in two places. I will use tflite as an example.
1. Preprocessing outside the model. This means you run MFCC in your driver code:
    model = new model(CNN, RNN, ...)
    while (stream) {
        energy = mfcc(audio)
        model.invoke(energy)
    }
2. Preprocessing inside the model. If the preprocessing step is already an Op (usually it is not), you can include the Op in your model:
    model = new model(MFCC, CNN, RNN, ...)
    while (stream) {
        model.invoke(audio)
    }
That being said, option 1 is the most approachable. Hope it helps.
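For the Iris example in the question, option 1 could look roughly like the sketch below. It assumes the exported model expects the 3 averaged features, and it uses a hypothetical stand-in function, fake_model, in place of the real frozen graph or TF Lite interpreter call; preprocess and predict are illustrative names as well.

```python
import numpy as np

def preprocess(raw):
    """Same averaging used at training time: (N, 4) raw -> (N, 3) averaged."""
    raw = np.asarray(raw, dtype=np.float32)
    return (raw[:, :-1] + raw[:, 1:]) / 2.0

def fake_model(features):
    """Hypothetical stand-in for invoking the exported model.

    Like the frozen graph in the question, it only accepts input of shape (N, 3).
    """
    if features.ndim != 2 or features.shape[1] != 3:
        raise ValueError("expected shape (N, 3), got %s" % (features.shape,))
    # Dummy "class scores"; a real model would return its own output here.
    return features / features.sum(axis=1, keepdims=True)

def predict(raw, model=fake_model):
    """Driver: preprocess outside the model, then invoke it."""
    return model(preprocess(raw))

sample = np.array([[5.1, 3.5, 1.4, 0.2]])  # one raw Iris row, shape (1, 4)
scores = predict(sample)
print(scores.shape)  # (1, 3)
```

Feeding the raw (1, 4) row straight to the model is what produces the shape error in the question, so the driver has to apply the same averaging that was used during training before it invokes the model.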
Upvotes: 1
Reputation: 1
The preprocessing happens in Python in this case, so you can use any Python primitives as long as you are invoking the TF Lite graph from Python.
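As an illustration of using an external file in Python-side preprocessing, here is a minimal sketch of stopword filtering. The file name stopwords.txt, its one-word-per-line format, and the helper names are assumptions made for the example; the resulting tokens would still need to be converted into whatever numeric input the model expects.

```python
import os
import tempfile

def load_stopwords(path):
    """Read one stopword per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def remove_stopwords(text, stopwords):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [tok for tok in text.lower().split() if tok not in stopwords]

# Write a tiny hypothetical stopword file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
stopword_path = os.path.join(tmp_dir, "stopwords.txt")
with open(stopword_path, "w", encoding="utf-8") as f:
    f.write("the\nis\na\n\n")

stopwords = load_stopwords(stopword_path)
tokens = remove_stopwords("The iris is a flower", stopwords)
print(tokens)  # ['iris', 'flower']
```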
Upvotes: 0