Reputation: 21
To improve performance on a project of mine, I wrote a function using tf.function to replace one that does not use TF. The result is that the plain Python code runs much faster (about 100x) than the tf.function version when the GPU is enabled. When running on the CPU, TF is still slower, but only about 10x. Am I missing something?
import tensorflow as tf

@tf.function
def test1(cond):
    xp = tf.constant(0)
    yp = tf.constant(0)
    stride = tf.constant(10)
    # One slot per (row, column) position visited by the two loops.
    patches = tf.TensorArray(
        tf.int32,
        size=tf.cast((cond / stride + 1) * (cond / stride + 1), dtype=tf.int32),
        dynamic_size=False,
        clear_after_read=False)
    i = tf.constant(0)
    while tf.less_equal(yp, cond):
        while tf.less_equal(xp, cond):
            xp = tf.add(xp, stride)
            patches = patches.write(i, xp)
            i += 1
        # Reset the column position before moving to the next row.
        xp = tf.constant(0)
        yp = tf.add(yp, stride)
    return patches.stack()

def test2(cond):
    xp = 0
    yp = 0
    stride = 10
    i = 0
    patches = []
    while yp <= cond:
        while xp <= cond:
            xp += stride
            patches.append(xp)
        # Reset the column position before moving to the next row.
        xp = 0
        yp += stride
    return patches
This is especially noticeable when cond is large (5000 or greater).
UPDATE:
I have found this and this. As I expected, the performance of TensorArray seems to be poor. The solution in my case was to replace the TensorArray and the loops with other tensor operations (I used tf.image.extract_patches, among others). That version runs 3x faster than the plain Python code.
Upvotes: 2
Views: 1043
Reputation: 991
When you call TF functions with the GPU enabled, each call has to dispatch the work and transfer data to the GPU. In some cases this overhead is not worth it. When running on the CPU the overhead is smaller, but the result is still slower than pure Python code.
TensorFlow is faster when you do heavy calculations, and that is what TensorFlow was made for. Even NumPy can be slower than pure Python code for light calculations.
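A rough way to see this fixed per-call cost is to time a trivial operation both ways. The sketch below is only an illustration: python_add, tf_add, and seconds_per_call are hypothetical names, and the numbers depend entirely on the machine and on whether a GPU is in use.

```python
import timeit

import tensorflow as tf


def python_add(a, b):
    # Plain Python addition, used as the baseline.
    return a + b


@tf.function
def tf_add(a, b):
    # The same light calculation, executed as a TensorFlow graph.
    return tf.add(a, b)


def seconds_per_call(fn, *args, number=1000):
    # Warm-up call: the first call to a tf.function traces the graph,
    # which should not be counted as per-call overhead.
    fn(*args)
    return timeit.timeit(lambda: fn(*args), number=number) / number


x, y = tf.constant(1), tf.constant(2)
print("python:     ", seconds_per_call(python_add, 1, 2))
print("tf.function:", seconds_per_call(tf_add, x, y))
```

For a calculation this small, the time is dominated by the cost of entering TensorFlow, not by the addition itself.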
Upvotes: 2
Reputation: 575
The slow part (the while loop) is still in Python, and simple functions like this are pretty fast. The linear overhead of switching from Python to TF on each iteration is certainly bigger than anything you could ever gain on such a small function. For more complex operations this might be very different. In this case, TF is simply overkill.
Upvotes: 0