m_power

Reputation: 3204

How to create a 2D array of repeated lower triangular arrays

I have a rectangular 2D array on which I want to apply a 2D indexing array (e.g. arr[indexing_array]).

import numpy as np

np.random.seed(1234)
arr = np.random.rand(4, 9)
print(arr.round(2))

[[0.19 0.62 0.44 0.79 0.78 0.27 0.28 0.8  0.96]
 [0.88 0.36 0.5  0.68 0.71 0.37 0.56 0.5  0.01]
 [0.77 0.88 0.36 0.62 0.08 0.37 0.93 0.65 0.4 ]
 [0.79 0.32 0.57 0.87 0.44 0.8  0.14 0.7  0.7 ]]

I want the 2D indexing array to be a repeated lower triangular, something similar to this for the array arr:

[[False False False False False False False False False]
 [ True False False  True False False  True False False]
 [ True  True False  True  True False  True  True False]
 [ True  True  True  True  True  True  True  True  True]]

Right now I'm creating this index with the following command:

nb_rep = 3 # The number of times the lower triangular array is repeated
k = 0 # An offset for the diagonal
np.arange(arr.shape[0])[:, None] + k > np.tile(np.arange(arr.shape[1]-6), nb_rep)

I tried a solution using np.tril and np.tril_indices, but it was considerably slower than this one. Is there a way to simplify this? I'm particularly unsure about the right-hand side of the >. I used np.tile, but from what I have read it may not be the fastest way to replicate arrays.
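One possible simplification, offered as a sketch rather than a definitive answer: derive the width of each triangular block from the array shape instead of hard-coding arr.shape[1]-6. The helper name repeated_tril_mask and the assumption that the number of columns divides evenly by nb_rep are illustrative and not part of the original post. An np.tril-based variant is included only to check that both constructions agree.

```python
import numpy as np


def repeated_tril_mask(n_rows, n_cols, nb_rep, k=0):
    """Boolean mask made of nb_rep side-by-side lower-triangular blocks.

    Hypothetical helper; assumes n_cols is divisible by nb_rep.
    """
    if n_cols % nb_rep != 0:
        raise ValueError("n_cols must be divisible by nb_rep")
    width = n_cols // nb_rep                  # columns per block (3 for the 4x9 example)
    rows = np.arange(n_rows)[:, None]         # column vector of row indices
    cols = np.tile(np.arange(width), nb_rep)  # 0..width-1 repeated nb_rep times
    return rows + k > cols                    # broadcasts to (n_rows, n_cols)


np.random.seed(1234)
arr = np.random.rand(4, 9)
mask = repeated_tril_mask(*arr.shape, nb_rep=3, k=0)

# Same mask built from np.tril: keep entries with col <= row - 1, then repeat the block.
block = np.tril(np.ones((arr.shape[0], 3), dtype=bool), k=-1)
mask_tril = np.tile(block, 3)

selected = arr[mask]  # 1D array of the elements where mask is True
```

Indexing with a boolean mask of the same shape as arr returns a flat 1D array of the selected elements, so the 2D layout is not preserved in selected.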

Upvotes: 0

Views: 68

Answers (1)

yann ziselman

Reputation: 2002

I don't know whether my method is the most efficient, but it runs faster than your code in my tests.
Your code:

import numpy as np
import pandas as pd
np.random.seed(1234)
arr = np.random.rand(4,9)

nb_rep = 3 # The number of times the lower triangular array is repeated
k = 0 # An offset for the diagonal
%timeit np.arange(arr.shape[0])[:, None] + k > np.tile(np.arange((arr.shape[1]-6)), nb_rep)

output:

The slowest run took 10.90 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 12.9 µs per loop

My method:

%timeit np.arange(arr.shape[0])[:, None] + k > ((np.arange(6*nb_rep) % arr.shape[0])[None, :])

output:

The slowest run took 15.11 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 6.02 µs per loop
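As a quick check on the 4x9 example (a sketch, not part of the original answer): with 6*nb_rep columns and a modulus of arr.shape[0], the timed expression produces a mask of shape (4, 18), which cannot index arr directly. A variant that takes the modulus over the block width, assumed here to be arr.shape[1] // nb_rep, reproduces the mask from the question. The names mask_as_posted, width, mask_mod and mask_tile are illustrative.

```python
import numpy as np

arr = np.random.rand(4, 9)
nb_rep = 3
k = 0

rows = np.arange(arr.shape[0])[:, None]

# Expression as posted: 18 columns, pattern repeats every 4 columns.
mask_as_posted = rows + k > (np.arange(6 * nb_rep) % arr.shape[0])[None, :]
print(mask_as_posted.shape)  # (4, 18)

# Assumed fix: one index per column of arr, wrapped at the block width.
width = arr.shape[1] // nb_rep  # 3 for the 4x9 example
mask_mod = rows + k > np.arange(arr.shape[1]) % width
print(mask_mod.shape)  # (4, 9)

# The modulo version matches the np.tile construction from the question.
mask_tile = rows + k > np.tile(np.arange(width), nb_rep)
print(np.array_equal(mask_mod, mask_tile))  # True
```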

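Using a much larger array of shape (4000, 9000), the difference is even more pronounced. Note, however, that the hard-coded 6 in both expressions is specific to the 4x9 example: on the larger array your expression builds a (4000, 26982) mask and mine a (4000, 18) mask, so these two timings are not a like-for-like comparison.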
Output of testing your code:

100 loops, best of 5: 46.8 ms per loop

Output of testing my code:

100 loops, best of 5: 133 µs per loop
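For a comparison on equal footing, here is a minimal sketch using the standard-library timeit module, in which both variants build the same (4000, 9000) mask. The block width n_cols // nb_rep and the function names mask_tile and mask_mod are assumptions made for illustration; no timings are quoted because they depend on the machine.

```python
import timeit

import numpy as np

n_rows, n_cols, nb_rep, k = 4000, 9000, 3, 0
width = n_cols // nb_rep  # assumed block width
rows = np.arange(n_rows)[:, None]


def mask_tile():
    # Column pattern built by repeating 0..width-1 with np.tile.
    return rows + k > np.tile(np.arange(width), nb_rep)


def mask_mod():
    # Column pattern built by wrapping the column index with a modulo.
    return rows + k > np.arange(n_cols) % width


# Both functions should return the same (4000, 9000) mask before timing them.
same = np.array_equal(mask_tile(), mask_mod())

t_tile = timeit.timeit(mask_tile, number=5) / 5
t_mod = timeit.timeit(mask_mod, number=5) / 5
print(f"identical: {same}")
print(f"tile: {t_tile * 1e3:.2f} ms, modulo: {t_mod * 1e3:.2f} ms")
```

With equal output shapes, I would expect most of the time to go into the broadcast comparison that fills the full boolean array, so the gap between the two variants is likely to be much smaller than the figures above suggest; it is worth measuring on your own machine.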

Upvotes: 1
