Reputation: 705

numpy indexing with an array of index (to be broadcasted)

Let A be an array of shape (n_0, n_1, …, n_p-2, n_p-1) with arbitrary p.

Let ind be and array of shape (m_0, m_1, …, m_p-2, m_p-1), with, for all k:

either m_k == n_k
or m_k == 1.

The values of ind are indices corresponding to the p-th axis of A, namely integers in {0, 1, …, p-1}.

I want to get an array B of shape (n_0, n_1, …, n_p-1) such that B[i_0, i_1, …, i_p-2, i_p-1] equals A[i_0, i_1, …, i_p-2, ind[j_0, j_1, …, j_p-2, j_p-1] with, for all k:

j_k == i_k if m_k == n_k,
or j_k == 0 if m_k == 1.

Upvotes: 2

Answers (2)

druskacik

Reputation: 2497

This is the fastest solution I could come up with after some testing. The idea is: broadcast ind matrix so that it has the same shape as A, then use the fact that the ind matrix has the same values accros axes which were expanded by broadcasting (i. e. there's no need to iterate over these axes).

If the shape of ind matrix is the same as the shape of A my function works actually slower, but with one or more dimensions of ind set to 1, this function works substantially faster.

Note that this is "logical" optimalization. I believe it's possible to make it faster by maybe using some other numpy functions, different handling of list and tuples etc.

The function:

import numpy as np
import time
from itertools import product

def time_shift_opt(signal, index_shifted):

  index_broadcasted = np.broadcast_to(index_shifted, signal.shape)
  out = np.empty(np.shape(signal))

  indices = product(*[range(d) for d in index_shifted.shape])

  dims_to_expand = [i for i, val in enumerate(index_shifted.shape) if val == 1]

  for index in indices:
    index_to_change = list(index)
    for d in dims_to_expand:
      index_to_change[d] = slice(None)

    index_to_change = tuple(index_to_change)

    val_index = list(index_to_change)
    val_index[-1] = index_broadcasted[index]
    value = signal[tuple(val_index)]

    if index_shifted.shape[-1] == 1:
      value = value[..., np.newaxis]

    out[index_to_change] = value

  return out

Comparing times: (See that with the growing number of ind dimensions set to 1 the function I proposed runs much faster)

def test_time_optimality(signal_shape, ind_shape, n_iterations = 10):
  p = signal_shape[-1]
  A = np.random.randint(1000, size = size)
  ind = np.random.randint(p, size = ind_size)

  averages = []

  temp_average = 0
  for i in range(n_iterations):
    start = time.time()
    res = time_shift(A, ind)
    temp_average += time.time() - start

  averages.append(temp_average/n_iterations)

  temp_average = 0
  for i in range(n_iterations):
    start = time.time()
    res = time_shift_opt(A, ind)
    temp_average += time.time() - start

  averages.append(temp_average/n_iterations)

  print('time_shift average duration: {:.4f}'.format(averages[0]))
  print('time_shift_opt average duration: {:.4f}'.format(averages[1]))

p = 3
size = (10, 20, 12, 50, p)
ind_size = (10, 20, 12, 50, p)
test_time_optimality(size, ind_size)
# time_shift average duration: 0.3734
# time_shift_opt average duration: 0.5365

size = (10, 20, 12, 50, p)
ind_size = (10, 1, 12, 50, p)
test_time_optimality(size, ind_size)
# time_shift average duration: 0.3697
# time_shift_opt average duration: 0.0414

size = (10, 20, 12, 50, p)
ind_size = (1, 1, 12, 50, p)
test_time_optimality(size, ind_size)
# time_shift average duration: 0.3840
# time_shift_opt average duration: 0.0080

size = (10, 20, 12, 50, p)
ind_size = (1, 1, 1, 1, 1)
test_time_optimality(size, ind_size)

# time_shift average duration: 0.3775
# time_shift_opt average duration: 0.0018

Upvotes: 1

M. Toya

Reputation: 705

I ended writing the following (not thoroughly tested) function. I suppose (hope) better performances are achievable

def time_shift(signal, index_shifted):
    """Time shift an array of signal with shifted (time) indices.

    Time is indexed by the last dimension.

    Args:
       signal (array): original signals.
       index_shifted (array): shifted indices, with same shape as signal, or
           possibly 1 along dimension to be broadcasted.

    Returns:
        Array with shifted values.
    """

    out = np.empty(np.shape(signal))
    it = np.nditer([index_shifted, out], flags=['multi_index'],
                   op_flags = [['readonly'],
                               ['writeonly', 'allocate', 'no_broadcast']])
    with it:
        for ind, zz in it:
            selection = (*it.multi_index[:-1], ind)
            zz[...] = signal[selection]
    return out

Upvotes: 1

numpy indexing with an array of index (to be broadcasted)

Answers (2)

Related Questions