tensorflow import causing numpy calculation errors

Question

I'm learning the basics of TensorFlow through a linear regression example. Performing the linear regression with scikit-learn works well:

import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression

housing = fetch_california_housing()

lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))

print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])

returning the following results:

[[ -3.69419202e+01]
 [  4.36693293e-01]
 [  9.43577803e-03]
 [ -1.07322041e-01]
 [  6.45065694e-01]
 [ -3.97638942e-06]
 [ -3.78654265e-03]
 [ -4.21314378e-01]
 [ -4.34513755e-01]]

Performing the same using numpy (Normal equation) also works fine:

m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

print(theta_numpy)

Output:

[[ -3.69419202e+01]
 [  4.36693293e-01]
 [  9.43577803e-03]
 [ -1.07322041e-01]
 [  6.45065694e-01]
 [ -3.97638942e-06]
 [ -3.78654265e-03]
 [ -4.21314378e-01]
 [ -4.34513755e-01]]
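
For reference, the normal equation used above is theta = (X^T X)^-1 X^T y. The following is a minimal sketch on synthetic data, not the housing dataset; the data, the seed, and the names `true_theta`, `theta_normal`, and `theta_solve` are assumptions made for illustration. It compares the explicit-inverse form with `np.linalg.solve`, which solves the same linear system without forming the inverse:

```python
import numpy as np

# Synthetic data (hypothetical, for illustration only): 200 samples, 3 features.
rng = np.random.RandomState(0)
features = rng.rand(200, 3)
true_theta = np.array([[2.0], [1.0], [-3.0], [0.5]])  # bias first, then weights

# Add the bias column, as in the question.
X = np.c_[np.ones((features.shape[0], 1)), features]
y = X.dot(true_theta)  # noise-free targets, so the exact solution is known

# Normal equation with an explicit inverse: theta = (X^T X)^-1 X^T y
theta_normal = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

# Same system, solved directly: (X^T X) theta = X^T y
theta_solve = np.linalg.solve(X.T.dot(X), X.T.dot(y))

print(np.allclose(theta_normal, true_theta))
print(np.allclose(theta_solve, theta_normal))
```

In a correctly working environment, both estimates should agree with `true_theta` up to floating-point error.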

However, when I import TensorFlow before running the linear regression, I get variable and inaccurate results:

import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression

housing = fetch_california_housing()

lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))

print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])

yielding the following results (different values every time):

[[  2.91247440e+32]
 [ -1.62971964e+11]
 [  1.42425463e+14]
 [ -4.82459003e+16]
 [ -1.33258747e+17]
 [ -2.04315813e+29]
 [  5.51179654e+14]
 [  5.92729561e+20]
 [  8.86284674e+21]]

If I run either of the calculations before importing tensorflow, then import tensorflow and repeat the calculations again, I get the correct results.

Any idea what the cause is, and how I can ensure I get correct results from numpy/scikit-learn after importing TensorFlow?

I'm running Python 3.5.4 from Anaconda 4.3.30 on Ubuntu with tensorflow-gpu.

numpy version: 1.12.1
tensorflow version: 1.3.0

drgfreeman · Accepted Answer

The Anaconda distribution uses Intel's Math Kernel Library (MKL) by default, which seems to cause multiple issues with NumPy and SciPy when they are used in conjunction with TensorFlow, as reported in this issue and in other referenced issues.

Re-installing NumPy and SciPy from pip resolves the issue:

First, create a new environment with the required packages using conda:

$ conda create --name env_name python=3.5 tensorflow-gpu scikit-learn

Activate the environment:

$ source activate env_name

Re-install NumPy and SciPy using pip:

$ pip install --ignore-installed --upgrade numpy scipy
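
To check which NumPy build the environment actually picks up, something like the following could be run inside it. This is a rough heuristic sketch: the helper name `numpy_mentions_mkl` is hypothetical, and it only searches the text printed by `np.show_config()` for the string "mkl", so treat the result as a hint rather than proof.

```python
import contextlib
import io

import numpy as np


def numpy_mentions_mkl():
    """Return True if NumPy's printed build configuration mentions MKL."""
    buffer = io.StringIO()
    # np.show_config() prints to stdout, so capture that output as text.
    with contextlib.redirect_stdout(buffer):
        np.show_config()
    return "mkl" in buffer.getvalue().lower()


print("NumPy version:", np.__version__)
print("NumPy location:", np.__file__)
print("Build config mentions MKL:", numpy_mentions_mkl())
```

The printed location shows whether the pip-installed copy in the environment is the one being imported.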

The drawback is that you lose the performance increase provided by MKL. As an example, a support vector machine built with scikit-learn that took 6 minutes to train with MKL took 11 minutes in an environment without MKL. You can, however, create another environment that has MKL (the default) to use when TensorFlow is not required.
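
To get a rough feel for the difference on your own machine, one option is to time the same linear algebra operation in each environment. The sketch below is only illustrative: the function name `time_matmul`, the matrix size, and the repeat count are arbitrary assumptions, and a matrix product will not necessarily scale the same way as the SVM training mentioned above.

```python
import time

import numpy as np


def time_matmul(size=500, repeats=3):
    """Return the best wall-clock time (seconds) for a size x size matrix product."""
    rng = np.random.RandomState(0)
    a = rng.rand(size, size)
    b = rng.rand(size, size)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a.dot(b)  # delegated to whichever BLAS library NumPy was built against
        best = min(best, time.perf_counter() - start)
    return best


print("Best time for a 500x500 matrix product: %.4f s" % time_matmul())
```

Running the same script in the MKL environment and in the pip-installed environment gives two numbers to compare.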
