Reputation: 33
I want to avoid the loop in my code. I need to implement the following formula, which looks straightforward at first:
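Reconstructed from the description and the code in this post, and assuming zero-based indices with N = len(x) (N and the name L are symbols introduced here for illustration), the formula reads roughly as:

```latex
L(x, I) = \sum_{i \in I} \; \sum_{j = i + 1}^{N - 1} \left( 1 + \log \lvert x_i - x_j \rvert \right)
```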
In words: a list of indices, called I, is passed in. For every index i in I, the values at all following indices of the array x are subtracted from x[i]. A calculation is applied to each difference, and everything is summed up. Done.
My current code:
import numpy as np

def loss(x, indices):
    """
    Args:
        x: array_like, dtype=float
        indices: array_like, dtype=int
    Example:
        >>> x = np.array([0.3, 0.5, 0.2, 0.1, 1.2, 2.4, 2.8, 1.5, 3.2])
        >>> indices = np.array([0, 2, 3, 6])
        >>> print(loss(x, indices))
        21.81621815885847
    """
    total = 0.0
    for index in indices:
        # Broadcasting here: the values at all following indices are
        # subtracted from the value at the given index.
        difference = x[index] - x[index + 1:]
        # Per-element term: 1 + log|difference|
        log_addition = 1.0 + np.log(np.abs(difference))
        # Sum all up
        total += np.sum(log_addition)
    return total
The challenging part is that the indices in I are scattered irregularly over the array, so no single slice covers them. Any ideas?
Upvotes: 3
Views: 101
Reputation: 221544
Here's one with NumPy-based vectorization -
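# mask[k, j] is True where column j lies strictly after indices[k]
mask = indices[:, None] < np.arange(len(x))
# v[k, j] = x[indices[k]] - x[j] for all pairs, wanted or not
v = x[indices, None] - x
vmasked = v[mask]
log_addition = np.log(np.abs(vmasked))
# The constant 1.0 per kept element adds up to the number of True entries
out = log_addition.sum() + mask.sum()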
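As a sanity check, here is a minimal sketch that wraps the steps above in a function and compares it with the loop from the question. The names `loss_loop` and `loss_vectorized` are made up for this illustration; the data is the docstring example from the question.

```python
import numpy as np


def loss_loop(x, indices):
    """Reference: the loop from the question."""
    total = 0.0
    for index in indices:
        difference = x[index] - x[index + 1:]
        total += np.sum(1.0 + np.log(np.abs(difference)))
    return total


def loss_vectorized(x, indices):
    """Masked-broadcast version of the same sum."""
    x = np.asarray(x, dtype=float)
    indices = np.asarray(indices)
    # mask[k, j] is True where column j lies strictly after indices[k]
    mask = indices[:, None] < np.arange(len(x))
    # v[k, j] = x[indices[k]] - x[j] for every pair, wanted or not
    v = x[indices, None] - x
    # Each kept pair contributes 1 + log|difference|;
    # the 1s add up to mask.sum()
    return np.log(np.abs(v[mask])).sum() + mask.sum()


x = np.array([0.3, 0.5, 0.2, 0.1, 1.2, 2.4, 2.8, 1.5, 3.2])
indices = np.array([0, 2, 3, 6])
print(loss_loop(x, indices), loss_vectorized(x, indices))
```

Note that the intermediate arrays have shape (len(indices), len(x)), so this trades memory for speed.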
Alternatively, using the laws of logarithms (a sum of logs equals the log of the product), we can replace the final two steps with -
out = np.log(np.prod(np.abs(vmasked))) + mask.sum()
Pushing the abs outside the product, so that it operates on a scalar instead, gives -
out = np.log(np.abs(np.prod(vmasked))) + mask.sum()
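One caveat with the product form, sketched below on made-up data: a product of many float64 values can leave the representable range even when the sum of their logs is perfectly finite. The array of 2000 tens is a hypothetical input chosen only to trigger the overflow.

```python
import numpy as np

# Hypothetical data: 2000 differences, each of magnitude 10
vmasked = np.full(2000, 10.0)

# Sum of logs stays finite: 2000 * log(10)
via_sum = np.log(np.abs(vmasked)).sum()

# The product 10**2000 exceeds the float64 range and becomes inf
with np.errstate(over="ignore"):
    via_prod = np.log(np.abs(np.prod(vmasked)))

print(via_sum, via_prod)
```

So the product variants are probably best kept for inputs where the number and magnitude of the differences are known to be moderate.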
Again, we can leverage multiple cores with numexpr -
import numexpr as ne
out = np.log(np.abs(ne.evaluate('prod(vmasked)'))) + mask.sum()
If you find that even v has too many unwanted elements, we can go directly to vmasked with -
xi = x[indices]
x2D = np.broadcast_to(x, (len(indices), len(x)))
vmasked = np.repeat(xi, mask.sum(1)) - x2D[mask]
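To illustrate that this shortcut yields the same elements in the same order as `v[mask]`, here is a small self-contained check on the example data from the question. The variable names `reference` and `direct` are introduced only for this sketch.

```python
import numpy as np

x = np.array([0.3, 0.5, 0.2, 0.1, 1.2, 2.4, 2.8, 1.5, 3.2])
indices = np.array([0, 2, 3, 6])

# mask[k, j] is True where column j lies strictly after indices[k]
mask = indices[:, None] < np.arange(len(x))

# Reference: build the full difference matrix, then mask it
v = x[indices, None] - x
reference = v[mask]

# Direct: repeat each x[i] once per kept column, subtract the kept x[j]
xi = x[indices]
x2D = np.broadcast_to(x, (len(indices), len(x)))  # read-only view, no copy
direct = np.repeat(xi, mask.sum(1)) - x2D[mask]

print(np.allclose(reference, direct), direct.shape)
```

The boolean mask itself still has shape (len(indices), len(x)), but the floating-point difference matrix is never materialized.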
Upvotes: 4