malioboro

Reputation: 3281

How to add preprocessing steps in TF Lite

I'm using the simple iris dataset, which has 4 features, and I want to apply some preprocessing steps before the data enters the network. For example, I want my NN to receive only 3 features, each being the average of two consecutive original features.

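import numpy as np
from copy import deepcopy

# x has shape (120, 4): 120 samples x 4 features
tmp = np.zeros((x.shape[0],x.shape[1]-1))
for i in range(x.shape[1]-1):
    # column i becomes the mean of original columns i and i+1
    tmp[:,i] = (x[:,i]+x[:,i+1])/2.
x = deepcopy(tmp) # after preprocessing, x has shape (120, 3)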
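
As a side note, the same averaging can be written without the explicit loop. This is only a minimal sketch, assuming x is a 2-D NumPy array; the function name average_adjacent is made up for illustration and is not part of any library:

```python
import numpy as np

def average_adjacent(x):
    """Return the mean of each pair of neighbouring columns (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # x[:, :-1] drops the last column, x[:, 1:] drops the first,
    # so adding them pairs column i with column i+1.
    return (x[:, :-1] + x[:, 1:]) / 2.0

# Tiny stand-in for the iris matrix: 2 samples x 4 features
x = np.array([[1.0, 3.0, 5.0, 7.0],
              [2.0, 2.0, 4.0, 8.0]])

# Loop version from the question, for comparison
tmp = np.zeros((x.shape[0], x.shape[1] - 1))
for i in range(x.shape[1] - 1):
    tmp[:, i] = (x[:, i] + x[:, i + 1]) / 2.0

averaged = average_adjacent(x)
print(averaged.shape)              # one column fewer than the input
print(np.allclose(averaged, tmp))  # both versions agree
```

The slicing form returns a new array, so no deepcopy is needed afterwards.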

I've tried adding those steps to input_function and changing the shape in the feature_columns definition to 3:

def input_function(x, y, is_train):

    tmp = np.zeros((x.shape[0],x.shape[1]-1))
    for i in range(x.shape[1]-1):
        tmp[:,i] = (x[:,i]+x[:,i+1])/2.
    x = deepcopy(tmp)

    dict_x = { "featurename" : x }  # key must match the feature column key

    dataset = tf.data.Dataset.from_tensor_slices((
        dict_x, y
    ))

    if is_train:
        dataset = dataset.shuffle(num_train).repeat(num_epoch).batch(num_train)
    else:
        dataset = dataset.batch(num_test)

    return dataset

The way I train the classifier:

feature_columns = [
    tf.feature_column.numeric_column(key="featurename",shape=3),
]

classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[50, 20],
    n_classes=3,
    optimizer=tf.train.GradientDescentOptimizer(0.001),
    activation_fn=tf.nn.relu,
    model_dir = 'modeliris2/'
)

classifier.train(
    input_fn=lambda:input_function(xtrain, ytrain, True)
)

My serving input function:

def my_serving_input_fn2():
    input_data = {
        "featurename" : tf.placeholder(tf.float32, [None, 3], name='inputtensors')
    }
    return tf.estimator.export.ServingInputReceiver(input_data, input_data)

It works when I run it, but if I freeze the model and then use it to predict, it doesn't work. It says:

ValueError: Cannot feed value of shape (1, 4) for Tensor 'import/inputtensors:0', which has shape '(?, 3)'

If I change the placeholder shape in my_serving_input_fn2 to [None, 4], I still get an error after freezing the model:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 4 values, but the requested shape has 3

My question: if I need to include preprocessing or feature engineering steps in my model (like MFCC in signal preprocessing, etc.), where should I put them? Is my approach correct? Why did the error occur? Or is there a better solution?

An additional question: what if my preprocessing steps need external files (like a stopword list in text preprocessing, etc.)? Is it still possible to include those files for preprocessing when using TF Lite?

Upvotes: 1

Views: 898

Answers (2)

J.L.

Reputation: 134

Technically, you can put the preprocessing step in one of two places. I will use TF Lite as an example.

  1. Preprocessing outside the model. This means the MFCC step lives in your driver code:

    model = new model(CNN, RNN, ...)
    while(stream) {
       energy = mfcc(audio)
       model.invoke(energy)
    }
    
  2. Preprocessing inside the model. If the preprocessing step is already available as an op (usually it is not), you can include that op in your model:

    model = new model(MFCC, CNN, RNN, ...)
    while(stream) {
        model.invoke(audio)
    }
    

That said, option 1 is the more approachable of the two. Hope it helps.
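
Applied to the iris case in the question, option 1 could look roughly like the sketch below. It is only an illustration under assumptions: invoke is a hypothetical callable standing in for the frozen graph or TF Lite interpreter, and fake_model is a dummy that merely enforces the (n, 3) input shape the trained model expects. Neither is a real TensorFlow API.

```python
import numpy as np

def average_adjacent(x):
    """Preprocessing from the question: mean of each pair of neighbouring features."""
    x = np.asarray(x, dtype=np.float32)
    return (x[:, :-1] + x[:, 1:]) / 2.0

def predict(raw_rows, invoke):
    """Option 1: preprocess in the driver, then pass the result to the model.

    `invoke` is a hypothetical stand-in for the real model call and is
    assumed to accept an array of shape (n, 3).
    """
    features = average_adjacent(raw_rows)
    return invoke(features)

def fake_model(features):
    # Dummy model: rejects anything that is not (n, 3), mimicking the
    # shape check of the frozen graph, then returns a placeholder index.
    if features.ndim != 2 or features.shape[1] != 3:
        raise ValueError("expected shape (n, 3), got %s" % (features.shape,))
    return np.argmax(features, axis=1)

raw = np.array([[5.1, 3.5, 1.4, 0.2]])   # one raw sample with 4 features

result = predict(raw, fake_model)        # preprocessed first, so the shape matches

try:
    fake_model(raw)                      # raw (1, 4) input is rejected
    raw_rejected = False
except ValueError:
    raw_rejected = True

print(result, raw_rejected)
```

The point of the sketch is that the exported model only ever sees 3 features, so whatever feeds it at prediction time has to repeat the same averaging that was applied at training time.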

Upvotes: 1

achowdhery

Reputation: 1

In this case the preprocessing happens in Python, so you can use any Python primitives as long as you are invoking the TF Lite graph from Python.
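
For the stopword example from the question, that could mean loading the external file with ordinary Python before the model is ever called. A minimal sketch, assuming a plain-text file with one stopword per line; the helper names and the temporary file are made up for illustration, and the remaining tokens would still have to be turned into whatever numeric input the model expects:

```python
import os
import tempfile

def load_stopwords(path):
    """Read one stopword per line into a set (hypothetical helper)."""
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def remove_stopwords(text, stopwords):
    """Lowercase, split on whitespace, and drop tokens found in the stopword set."""
    return [tok for tok in text.lower().split() if tok not in stopwords]

# Demo only: write a tiny stopword file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as f:
    f.write("the\nis\na\n")
    stopword_path = f.name

stopwords = load_stopwords(stopword_path)
tokens = remove_stopwords("The iris is a flower", stopwords)
os.remove(stopword_path)  # clean up the demo file

print(tokens)
```

Because this runs outside the graph, the stopword file only needs to be shipped alongside the driver code, not inside the model file.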

Upvotes: 0
