Andrey Noskov

Reputation: 115

Printing TensorFlow and NumPy values to stdout

I have a problem printing a numpy.float32 value to stdout. Here is the code:

import numpy as np
import tensorflow as tf

n_samples = 1000
batch_size = 100
num_steps = 20000

x_data = np.random.uniform(1, 10, (n_samples, 1))
y_data = 2 * x_data + 1 + np.random.normal(0, 2, (n_samples, 1))

x = tf.placeholder(tf.float32, shape=(batch_size, 1))
y = tf.placeholder(tf.float32, shape=(batch_size, 1))

with tf.variable_scope('linear-regression'):
    k = tf.Variable(tf.random_normal((1, 1)), name='slope')
    b = tf.Variable(tf.zeros(1,), name='bias')

y_pred = tf.matmul(x, k) + b
loss = tf.reduce_sum((y - y_pred) ** 2)
optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)

display_step = 5000
with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    for i in range(num_steps):
        indices = np.random.choice(n_samples, batch_size)
        x_batch, y_batch = x_data[indices], y_data[indices]
        _, loss_val, k_val, b_val = session.run((optimizer, loss, k, b), feed_dict={x: x_batch, y: y_batch})
        if (i + 1) % display_step == 0:
            print(f'Epoch {i+1}: loss = {loss_val.item():.3f}, k = {np.sum(k_val).item():.3f}, b = {np.sum(b_val).item():.3f}')

When I try to print the values on that last line, I get something like this:

Epoch 5000: loss = nan, k = nan, b = nan
Epoch 10000: loss = nan, k = nan, b = nan
Epoch 15000: loss = nan, k = nan, b = nan

I'm using Visual Studio Code on Windows 10. In debug mode I tried to print loss_val, and also to convert it to a native Python float and print that, but I get a None value.

Thanks for your help =)

P.S. TensorFlow 1.4.0, NumPy 1.14, Windows 10, Visual Studio Code as IDE.

Update:

While stopped at the if statement in debug mode, I tried

print(1)

and got:

None
1

What am I doing wrong? It looks like something has redefined print().

Upvotes: 0

Views: 219

Answers (1)

Maxim

Reputation: 53768

You're seeing NaNs because the values in the network explode very quickly and become too large to fit in a float32. This explosion is caused primarily by your hyperparameters:

  • the initial value of k is too large; reduce the standard deviation, e.g.:

    k = tf.Variable(tf.random_normal((1, 1), stddev=0.001), name='slope')
    
  • the learning rate is too high as well; try 0.01 instead of 0.05;

  • use tf.reduce_mean instead of tf.reduce_sum so that the loss and its gradients don't scale with the batch size.

The resulting code:

x = tf.placeholder(tf.float32, shape=(batch_size, 1))
y = tf.placeholder(tf.float32, shape=(batch_size, 1))

with tf.variable_scope('linear-regression'):
    k = tf.Variable(tf.random_normal((1, 1), stddev=0.001), name='slope')
    b = tf.Variable(tf.zeros(1,), name='bias')

y_pred = tf.matmul(x, k) + b
loss = tf.reduce_mean((y - y_pred) ** 2)
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

P.S. You should also consider normalizing the input if you want better results.
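For instance, a minimal sketch of standardizing the inputs before training, reusing the x_data/y_data arrays from the question (the x_norm/y_norm names and the back-mapping comments are just illustrative):

import numpy as np

n_samples = 1000
x_data = np.random.uniform(1, 10, (n_samples, 1))
y_data = 2 * x_data + 1 + np.random.normal(0, 2, (n_samples, 1))

# Standardize both arrays to zero mean and unit variance
# before feeding them to the placeholders.
x_norm = (x_data - x_data.mean()) / x_data.std()
y_norm = (y_data - y_data.mean()) / y_data.std()

# If the slope/bias on the original scale are needed afterwards,
# map the learned values back:
#   k_raw = k_norm * y_data.std() / x_data.std()
#   b_raw = y_data.mean() + y_data.std() * b_norm - k_raw * x_data.mean()

With standardized inputs the gradient magnitudes stay in a predictable range, so the learning rate needs far less tuning.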

Upvotes: 1
