Reputation: 1254
I am normalizing my input training data using data_norm = tf.nn.l2_normalize(data, 0). The data is of shape [None, 4], and each column is a feature. It might look like this:
data = [[-3., 0.2, 1.6, 0.5],
        [3.6, 1.5, -1.9, 0.71],
        ...]
I understand that, since the training set is normalized, the test set should be normalized too, but using the std and mean from the training set. (I assume this also applies during actual use of the NN, i.e. input should be normalized using the training set mean and std before being fed into the NN.)
Is there a way to extract/save the mean and std used for normalization from this function, so I can normalize my test set using the same mean and std used for normalizing the training data?
I know how to save the weights etc. with saver.save(sess, "checkpoints/" + save_id). Is there a way to save/load std and mean like this?
Upvotes: 2
Views: 1185
Reputation: 999
I am not an expert in tensorflow but I am happy to share what worked for me. I made two additional variables before starting the session for training:
# note: np.mean/np.std without an axis argument give one scalar over all features
train_mean = tf.Variable(np.mean(X_train), name='train_mean', dtype=tf.float64)
train_std = tf.Variable(np.std(X_train), name='train_std', dtype=tf.float64)
# initialize other variables here
with tf.Session() as sess:
    sess.run(init)
    # normalize data with the training-set statistics
    train_mean_py = sess.run(train_mean)
    train_std_py = sess.run(train_std)
    X_train = (X_train - train_mean_py) / train_std_py
    X_test = (X_test - train_mean_py) / train_std_py
    # do training
    # save model
When I recover the model later in a different script, I do the following:
# define the variables that have to be recovered later
train_mean = tf.Variable(0., name='train_mean', dtype=tf.float64)
train_std = tf.Variable(0., name='train_std', dtype=tf.float64)
with tf.Session() as sess:
    sess.run(init)
    # restoring overwrites the placeholder values with the saved ones
    saver.restore(sess, "./trained_models/{0}.ckpt".format(model_name))
    # normalize data
    train_mean_py = sess.run(train_mean)
    train_std_py = sess.run(train_std)
    X_test = (X_test - train_mean_py) / train_std_py
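If the statistics do not need to live in the TensorFlow checkpoint, a plain NumPy file can hold them as well. The following is only a sketch of that alternative, not what the code above does: the helper names save_stats and load_stats and the file name norm_stats.npz are made up for illustration, and the mean and std are taken per column (axis=0) because the question treats each column as a feature.

```python
import os
import tempfile

import numpy as np


def save_stats(path, X_train):
    """Compute per-column mean/std on the training set and save them to an .npz file."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    np.savez(path, mean=mean, std=std)
    return mean, std


def load_stats(path):
    """Load the mean/std arrays written by save_stats."""
    with np.load(path) as stats:
        return stats["mean"], stats["std"]


# Training script: compute and store the statistics once.
X_train = np.array([[-3.0, 0.2, 1.6, 0.5],
                    [3.6, 1.5, -1.9, 0.71]])
stats_path = os.path.join(tempfile.mkdtemp(), "norm_stats.npz")  # hypothetical location
save_stats(stats_path, X_train)

# Later script: reload the training statistics and apply them to new data.
mean, std = load_stats(stats_path)
X_test = np.array([[0.3, 0.85, -0.15, 0.605]])
X_test_norm = (X_test - mean) / std
```

Storing the file next to the checkpoint keeps the model and its input scaling together.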
Upvotes: 1
Reputation: 323
From the documentation:
For a 1-D tensor with dim = 0, computes
output = x / sqrt(max(sum(x**2), epsilon))
epsilon defaults to 1e-12.
So you could just apply this same function to the test data.
HTH!
Cheers,
-maashu
Upvotes: 0
Reputation: 4918
tf.nn.l2_normalize scales by the L2 norm of whatever input it receives at call time. It does not compute a mean or std, so you can't use this function to apply the training data's mean or std.
From the l2_normalize docs:
output_l2_normalize = input / sqrt(max(sum(input**2), epsilon))
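To see what that formula implies, here is a small NumPy re-implementation along axis 0. It is a sketch for illustration only (the function name l2_normalize_axis0 is made up, and it is not TensorFlow's own code); it shows that each input is divided by its own column norms, so nothing from the training data carries over to a second call.

```python
import numpy as np


def l2_normalize_axis0(x, epsilon=1e-12):
    """Divide each column by its own L2 norm, following the documented formula."""
    x = np.asarray(x, dtype=np.float64)
    sq_sum = np.sum(x ** 2, axis=0, keepdims=True)
    return x / np.sqrt(np.maximum(sq_sum, epsilon))


train = np.array([[3.0, 1.0],
                  [4.0, 1.0]])
test = 2.0 * train  # same pattern, twice the scale

train_norm = l2_normalize_axis0(train)
test_norm = l2_normalize_axis0(test)

# Both calls give the same result: the factor of 2 is divided away by the
# test data's own norm, so the training scale is not applied to the test set.
print(train_norm)
print(test_norm)
```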
Note: Since you are trying to normalize the input data, you may precompute the global (training dataset) mean and std and write your own function to normalize.
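One possible shape for such a function, as a minimal NumPy sketch: fit_standardizer and standardize are hypothetical names, the statistics are per column, and the small eps guard against a zero std is an added assumption rather than anything from the TensorFlow docs.

```python
import numpy as np


def fit_standardizer(X_train, eps=1e-12):
    """Return per-column mean and std computed on the training set only."""
    mean = X_train.mean(axis=0)
    std = np.maximum(X_train.std(axis=0), eps)  # avoid division by zero
    return mean, std


def standardize(X, mean, std):
    """Apply previously computed training statistics to any data set."""
    return (X - mean) / std


X_train = np.array([[-3.0, 0.2, 1.6, 0.5],
                    [3.6, 1.5, -1.9, 0.71]])
X_test = np.array([[0.0, 1.0, 0.0, 0.6]])

mean, std = fit_standardizer(X_train)
X_train_norm = standardize(X_train, mean, std)
X_test_norm = standardize(X_test, mean, std)  # uses training mean/std, not its own
```

The same mean and std would then be applied to any input fed to the network at inference time.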
Upvotes: 1