D.Laupheimer

Reputation: 1074

Do I have to preprocess test data using neural networks?

I am using Keras (version 2.0.0) and I'd like to use pretrained models such as VGG16. To get started, I ran the example from the [Keras documentation site](https://keras.io/applications/) for extracting features with VGG16:

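from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np

# Load VGG16 with ImageNet weights, without the fully connected top layers
model = VGG16(weights='imagenet', include_top=False)

# Load one image, resized to the 224x224 input size used here
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)  # add a batch dimension
x = preprocess_input(x)        # apply the model's own preprocessing

# Extract the convolutional features
features = model.predict(x)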

The preprocess_input() call bothers me. Looking at the source code, the function zero-centers the input by subtracting the mean pixel.
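To make the zero-centering step concrete, here is a minimal NumPy sketch of what such a function might do for a channels-last batch. The helper name `zero_center_like_vgg16` is hypothetical, and the RGB-to-BGR reordering and per-channel mean values are stated as assumptions about the Caffe-style VGG16 preprocessing rather than a copy of the Keras source.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed values for the
# Caffe-style VGG16 preprocessing; check the Keras source to confirm).
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def zero_center_like_vgg16(batch_rgb):
    """Hypothetical helper: mean-pixel zero-centering for a channels-last
    batch of shape (N, H, W, 3) given in RGB order."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]          # reorder channels RGB -> BGR
    return batch_bgr - IMAGENET_BGR_MEAN  # broadcast subtraction per channel


# A dummy 2x2 "image" whose every pixel equals the assumed mean colour (RGB order).
mean_rgb = np.array([123.68, 116.779, 103.939], dtype=np.float32)
dummy = np.tile(mean_rgb, (1, 2, 2, 1))   # shape (1, 2, 2, 3)

centered = zero_center_like_vgg16(dummy)
print(centered.shape)
print(centered.min(), centered.max())
```

An image made entirely of the mean colour maps to all zeros, which is the sense in which the inputs are "centered" around the training-set mean.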

Do I really have to preprocess input data (validation/test data) before using a trained model?

a) If yes: can one conclude that you always have to know which preprocessing steps were performed during the training phase?

b) If no: Does preprocessing of validation/test data cause a bias?

I appreciate your help.

Upvotes: 3

Views: 1444

Answers (1)

Dref360

Reputation: 628

Yes, you should use the preprocessing step. You can retrain the model without it, but then the first layers have to learn to center your data, which is a waste of parameters.

If you do not recenter the inputs to a pretrained model, its performance will suffer, because the weights were learned on centered inputs.
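As a rough illustration of this point, the following sketch computes the centering statistics once on training data and reuses exactly the same values for test data. The array shapes, the synthetic pixel distribution, and all variable names are assumptions made up for the example, not something taken from VGG16.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for image batches in (N, H, W, C) layout,
# with pixel values scattered around 120 (purely illustrative data).
train = rng.normal(loc=120.0, scale=10.0, size=(100, 8, 8, 3))
test = rng.normal(loc=120.0, scale=10.0, size=(20, 8, 8, 3))

# The statistics are estimated on the training set only...
channel_mean = train.mean(axis=(0, 1, 2))  # one mean per colour channel

# ...and the very same values are subtracted from every later input.
train_centered = train - channel_mean
test_centered = test - channel_mean

print("raw test mean:     ", test.mean())
print("centered test mean:", test_centered.mean())
```

The centered test data ends up in the same value range the model saw during training, whereas the raw test data is shifted by roughly the mean pixel value. This is also why one needs to know which preprocessing was used when the model was trained.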

There is a great thread on Reddit about this: https://www.reddit.com/r/MachineLearning/comments/3q7pjc/why_is_removing_the_mean_pixel_value_from_each/

Upvotes: 1
