O.rka

Reputation: 30747

Time comparison for TensorFlow operation and Numpy multiplication

I've been trying to optimize my computations, and for most operations I've tried, TensorFlow is much faster. I'm trying to do a fairly simple operation: transform a matrix by multiplying each value by 1/2 and then adding 1/2 to the result.

With the help of @mrry, I was able to do these operations in TensorFlow. However, to my surprise, the NumPy method was significantly faster.

TensorFlow seems like an extremely useful tool for data scientists, and I think this could help clarify its use and advantages.

Am I not using TensorFlow data structures and operations in the most efficient way? I'm not sure how non-TensorFlow methods could be faster. I'm using a Mid-2012 MacBook Air with 4 GB of RAM.

trans1 is the TensorFlow version and trans2 is the NumPy version. DF_var is a pandas DataFrame.

import pandas as pd
import tensorflow as tf
import numpy as np

def trans1(DF_var):
    #Total user time is 31.8532807827 seconds

    #Create placeholder 
    T_feed = tf.placeholder(tf.float32,DF_var.shape)

    #Matrix transformation
    T_signed = tf.add(
                      tf.constant(0.5,dtype=tf.float32),
                      tf.mul(T_feed,tf.constant(0.5,dtype=tf.float32))
                      ) 

    #Get rid of the upper triangle
    T_ones = tf.constant(np.tril(np.ones(DF_var.shape)),dtype=tf.float32)
    T_tril = tf.mul(T_signed,T_ones)

    #Start Graph Session
    sess = tf.Session()

    DF_signed = pd.DataFrame(
                          sess.run(T_tril,feed_dict={T_feed: DF_var.as_matrix()}),
                          columns = DF_var.columns, index = DF_var.index
                          )
    #Close Graph Session
    sess.close() 
    return(DF_signed)

def trans2(DF_var):
    #Total user time is 1.71233415604 seconds
    M_computed = np.tril(np.ones(DF_var.shape))*(0.5 + 0.5*DF_var.as_matrix())
    DF_signed = pd.DataFrame(M_computed,columns=DF_var.columns, index=DF_var.index)
    return(DF_signed)
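For anyone who wants to check what the transformation does, here is a minimal NumPy-only sketch on a small matrix. The function name `signed_lower_triangle` and the 3x3 input are made up for illustration and are not part of the code above. It also shows that `np.tril` applied directly to the transformed matrix gives the same result as multiplying by a lower-triangular mask of ones.

```python
import numpy as np

def signed_lower_triangle(matrix):
    """Map each value x to 0.5 + 0.5*x, then zero everything above the diagonal."""
    matrix = np.asarray(matrix, dtype=np.float64)
    mask = np.tril(np.ones(matrix.shape))  # ones on and below the diagonal
    return mask * (0.5 + 0.5 * matrix)

# Hypothetical symmetric input with values in [-1, 1].
example = np.array([[ 1.0, -1.0, 0.0],
                    [-1.0,  1.0, 0.5],
                    [ 0.0,  0.5, 1.0]])

result = signed_lower_triangle(example)

# Same result without building the mask of ones.
result_direct = np.tril(0.5 + 0.5 * example)

print(result)
```

The second form skips one temporary array, which may matter for large inputs, though I have not measured it here.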

My timing method was:

import time
start_time = time.time()
#operation
print(str(time.time() - start_time))
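A single `time.time()` pair measures one run and includes any setup done inside the timed function. In `trans1`, the graph is built and the session is created inside the function, so both are part of the measured time. Whether that explains the gap is not established here. As a sketch of a more repeatable measurement, the standard-library `timeit` module can run the operation several times with the input prepared outside the timed region. The `transform` function and the 500x500 random input below are assumptions for illustration only.

```python
import timeit
import numpy as np

def transform(matrix):
    # NumPy version of the transformation: scale, shift, keep lower triangle.
    return np.tril(0.5 + 0.5 * matrix)

# Hypothetical input, created once outside the timed region.
rng = np.random.RandomState(0)
matrix = rng.uniform(-1.0, 1.0, size=(500, 500))

calls_per_run = 10
# Five independent runs of ten calls each; the minimum is the least noisy estimate.
timings = timeit.repeat(lambda: transform(matrix), repeat=5, number=calls_per_run)
best_per_call = min(timings) / calls_per_run

print("best time per call: %.6f s" % best_per_call)
```

Taking the minimum over repeats reduces the influence of other processes on the machine. The same pattern could be applied to a TensorFlow version by timing only the `sess.run` call.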

Answers (1)

Salvador Dali

Reputation: 222869

Your results are consistent with benchmarks published by someone else.

In his benchmark, he compared NumPy, Theano, and TensorFlow on

an Intel core i5-4460 CPU with 16GiB RAM and a Nvidia GTX 970 with 4 GiB RAM using Theano 0.8.2, Tensorflow 0.11.0, CUDA 8.0 on Linux Mint 18

His results for addition show that: [figure: addition benchmark, image not preserved]

He also tested a few other functions such as matrix multiplication:

[figure: matrix multiplication benchmark, image not preserved]

The results are:

It is clear that the main strengths of Theano and TensorFlow are very fast dot products and matrix exponents. The dot product is approximately 8 and 7 times faster respectively with Theano/Tensorflow compared to NumPy for the largest matrices. Strangely, matrix addition is slow with the GPU libraries and NumPy is the fastest in these tests.

The minimum and mean of matrices are slow in Theano and quick in Tensorflow. It is not clear why Theano is as slow (worse than NumPy) for these operations.
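To get a feel for why addition and dot products behave so differently, here is a CPU-only NumPy sketch that times both at a few sizes. It does not reproduce the GPU benchmark quoted above. The sizes and the helper `best_time` are assumptions chosen for illustration. The relevant fact is that adding two n x n matrices does about n^2 arithmetic operations, while their dot product does about 2n^3 on the same amount of input data. A commonly offered explanation, which this sketch does not prove, is that fixed per-call and data-transfer overheads are easier to amortize when there is more arithmetic per element.

```python
import timeit
import numpy as np

def best_time(func, repeat=3, number=3):
    """Return the best observed time per call, in seconds."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number

rng = np.random.RandomState(0)
results = {}

for n in (64, 128, 256):
    a = rng.rand(n, n).astype(np.float32)
    b = rng.rand(n, n).astype(np.float32)
    # The lambdas are called inside this iteration, so they see the current a and b.
    results[n] = {
        "add": best_time(lambda: a + b),
        "dot": best_time(lambda: a.dot(b)),
    }
    print("n=%d  add: %.6f s  dot: %.6f s" % (n, results[n]["add"], results[n]["dot"]))
```

Absolute numbers depend heavily on the machine and the BLAS library NumPy is linked against, so only the trend across sizes is meaningful.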
