Reputation: 28222
In TensorFlow there are often many ways to do the same thing. For example, to do `x += b`, one can use a single `assign_add`, or one can use an `add` followed by an `assign`.
There are other, similar examples where two ops can do the job of one (a couple of them are sketched in code below):

- `Concat` + `ExpandDims` vs `Stack`
- `scatter_nd_update` multiple times vs `scatter_nd_update` all at once, by separately precomputing the index combinations you want to update
- `add_n` vs n `add`s

Are the single operations fundamentally faster/better, or are they there for convenience?
Does using XLA JIT change this?
(Motivation is in defining the overloads in the Julia binding)
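For concreteness, here is a minimal sketch of a couple of these pairs, written against TensorFlow's Python API (TF 2.x eager style, purely for illustration):

```python
import tensorflow as tf

x = tf.Variable([1.0, 2.0])
b = tf.constant([0.5, 0.5])

# One fused op: read, add, and write back in place.
x.assign_add(b)

# Two ops: tf.add materializes an intermediate tensor, then assign writes it.
x.assign(tf.add(x, b))

# add_n vs a chain of binary adds.
t1, t2, t3 = tf.ones([3]), tf.ones([3]), tf.ones([3])
s_single = tf.add_n([t1, t2, t3])  # one op
s_chain = t1 + t2 + t3             # two add ops, one intermediate

# stack vs expand_dims + concat.
m_single = tf.stack([t1, t2], axis=0)
m_pair = tf.concat([tf.expand_dims(t1, 0), tf.expand_dims(t2, 0)], axis=0)
```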
Upvotes: 0
Views: 84
Reputation: 5206
In TensorFlow, a single operation, if available, is generally more efficient. `x = x + b` often allocates memory for the intermediate `x + b` and then frees it, while `x += b` has no such overhead. The same applies to the many fused kernels in TensorFlow, such as those for the softmax losses.
We hope that eventually XLA will get to the point where straightforward code is as efficient as code that minimizes the number of kernels, but that is not always the case as of May 2017.
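(For reference, a minimal sketch of enabling the XLA JIT in current TensorFlow; the `jit_compile` flag postdates this May 2017 answer:)

```python
import tensorflow as tf

@tf.function(jit_compile=True)  # ask XLA to compile (and fuse) this function
def composed(x, b):
    # Two logical adds; XLA can fuse them into a single kernel, so the
    # intermediate x + b need not be materialized in memory.
    return tf.add(tf.add(x, b), b)

y = composed(tf.ones([1024]), tf.ones([1024]))
```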
Upvotes: 1
Reputation: 214
It's actually very difficult to tell which is operationally more efficient.
For an example of which is better, consider `a = a + b` vs `a += b` in a language compiled down to assembly code, e.g. MIPS. The MIPS `add` instruction generally has the form

    add save_ref, value_ref1, value_ref2

so a compiler would emit both of the given operations as `add a, a, b` or `add a, b, a`, which are identical. To figure out which is operationally more efficient in TF, you would have to look at the documentation or the source code and hope that it elaborates on the O(n) running times.
Calling the single `scatter_nd_update` might be marginally faster, as you might save a little stack space; calling a method has some very small marginal cost. Most likely the cost is negligible.
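A minimal sketch of the two scatter styles, assuming a 1-D `tf.Variable` (the indices and values are illustrative):

```python
import tensorflow as tf

v = tf.Variable(tf.zeros([8]))

# Several small updates: one op (and kernel launch) per call.
for i, val in [(0, 1.0), (3, 2.0), (5, 3.0)]:
    v.scatter_nd_update(indices=[[i]], updates=[val])

# One batched update: precompute all the indices, then a single op does the job.
v.scatter_nd_update(indices=[[0], [3], [5]], updates=[1.0, 2.0, 3.0])
```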
Upvotes: 0