sandboxj

Reputation: 1254

Is there a tensorflow way to extract/save mean and std used for normalization?

I am normalizing my input training data using data_norm = tf.nn.l2_normalize(data, 0).

The data is of shape [None, 4]. Each column is a feature. It might look like this:

data = [[-3., 0.2, 1.6, 0.5],
        [3.6, 1.5, -1.9, 0.71],
        ...]

I understand that if the training set is normalized, the test set should be normalized too, but using the mean and std from the training set. (I assume this also applies when the NN is actually used, i.e. input should be normalized with the training set mean and std before being fed into the NN.)

Is there a way to extract/save the mean and std used for normalization from this function, so I can normalize my test set with the same mean and std used for the training data? I know how to save the weights etc. with saver.save(sess, "checkpoints/" + save_id). Is there a way to save/load the mean and std like this?

Upvotes: 2

Views: 1185

Answers (3)

Chris

Reputation: 999

I am not an expert in TensorFlow, but I am happy to share what worked for me. I made two additional variables before starting the training session:

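import numpy as np
import tensorflow as tf  # TensorFlow 1.x API

# np.mean / np.std without an axis give one scalar over all features;
# pass axis=0 (and restore with a matching shape) for per-feature statistics
train_mean = tf.Variable(np.mean(X_train), name='train_mean', dtype=tf.float64)
train_std = tf.Variable(np.std(X_train), name='train_std', dtype=tf.float64)

# initialize other variables here
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)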

    # normalize data
    train_mean_py = sess.run(train_mean)
    train_std_py = sess.run(train_std)
    X_train = (X_train - train_mean_py) / train_std_py
    X_test = (X_test - train_mean_py) / train_std_py

    # do training

    # save model

When I restore the model later in a different script, I do the following:

# define the variables that have to be later recovered
train_mean = tf.Variable(0., name='train_mean', dtype=tf.float64)
train_std = tf.Variable(0., name='train_std', dtype=tf.float64)


with tf.Session() as sess:
    sess.run(init)

    saver.restore(sess, "./trained_models/{0}.ckpt".format(model_name))

    # normalize data
    train_mean_py = sess.run(train_mean)
    train_std_py = sess.run(train_std)
    X_test = (X_test - train_mean_py) / train_std_py

Upvotes: 1

Maashu

Reputation: 323

From the documentation:

For a 1-D tensor with dim = 0, computes

output = x / sqrt(max(sum(x**2), epsilon))

epsilon defaults to 1e-12.

So you could just apply this same function to the test data.

HTH!

Cheers,

-maashu

Upvotes: 0

Ishant Mrinal

Reputation: 4918

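tf.nn.l2_normalize does not use a mean or std at all. It divides the input by its L2 norm, which is recomputed from whatever tensor is passed in, so you can't use this function to apply the training data's mean or std. From the l2_normalize docs: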

output_l2_normalize = input / sqrt(max(sum(input**2), epsilon))
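
To make the formula concrete, here is a minimal NumPy sketch of it (a re-statement of the documented formula, not TensorFlow's actual implementation). The helper name l2_normalize and the test row are made up for illustration; the training rows are the two shown in the question.

```python
import numpy as np

def l2_normalize(x, axis=0, epsilon=1e-12):
    """NumPy re-statement of x / sqrt(max(sum(x**2), epsilon))."""
    x = np.asarray(x, dtype=np.float64)
    square_sum = np.sum(x ** 2, axis=axis, keepdims=True)
    return x / np.sqrt(np.maximum(square_sum, epsilon))

# The two training rows from the question
train = np.array([[-3.0, 0.2, 1.6, 0.5],
                  [3.6, 1.5, -1.9, 0.71]])
# Hypothetical single test row (made up for this example)
test = np.array([[1.0, 0.4, 0.3, 0.2]])

train_norm = l2_normalize(train, axis=0)
test_norm = l2_normalize(test, axis=0)

# Every column of the normalized training data has unit L2 norm
col_norms = np.linalg.norm(train_norm, axis=0)
```

The divisor is computed from the tensor passed in, so normalizing a single positive test row along axis 0 divides each entry by itself. No statistic from the training data is involved.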

Note: Since you are trying to normalize the input data, you can precompute the global (training dataset) mean and std and write your own normalization function.
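
One way this could look, as a rough sketch outside of TensorFlow: compute per-feature statistics with NumPy, store them in an .npz file next to the checkpoint, and load them again at test time. The function names fit_stats and normalize, the file name norm_stats.npz, and the test row are all assumptions made up for this example.

```python
import os
import tempfile
import numpy as np

def fit_stats(x_train):
    """Per-feature (column-wise) mean and std of the training data."""
    return x_train.mean(axis=0), x_train.std(axis=0)

def normalize(x, mean, std, epsilon=1e-12):
    """Standardize x with the given statistics; epsilon guards against std == 0."""
    return (x - mean) / np.maximum(std, epsilon)

# The two training rows from the question, plus a made-up test row
x_train = np.array([[-3.0, 0.2, 1.6, 0.5],
                    [3.6, 1.5, -1.9, 0.71]])
x_test = np.array([[1.0, 0.4, 0.3, 0.2]])

mean, std = fit_stats(x_train)

# Save the statistics (hypothetical file name, temp directory for the demo)
stats_path = os.path.join(tempfile.mkdtemp(), "norm_stats.npz")
np.savez(stats_path, mean=mean, std=std)

# Later, e.g. in a separate inference script: load and reuse them
with np.load(stats_path) as stats:
    loaded_mean, loaded_std = stats["mean"], stats["std"]

x_train_norm = normalize(x_train, mean, std)
x_test_norm = normalize(x_test, loaded_mean, loaded_std)
```

Using axis=0 keeps one mean and std per feature column, which matches the [None, 4] layout described in the question.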

Upvotes: 1
