Reputation: 117
How do I dynamically slice each row given a starting and ending index, without using a for loop? I can do it with the loop listed below, but it is far too slow when x.shape[0] is greater than 1 million.
import numpy as np
x = np.arange(0, 100)
x = x.reshape(20,5)
s_idx = np.random.randint(0,3,x.shape[0])
e_idx = np.random.randint(3,6,x.shape[0])
print(s_idx)
>>> array([2, 1, 2, ..., 1, 0, 2])
print(e_idx)
>>> array([3, 4, 5, ..., 3, 3, 3])
print(x)
>>> array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
...,
[85, 86, 87, 88, 89],
[90, 91, 92, 93, 94],
[95, 96, 97, 98, 99]])
x_indexed = []
for idx, value in enumerate(s_idx):
    # Slice row idx from its own start column up to (not including) its end column
    x_indexed.append(x[idx][s_idx[idx]:e_idx[idx]])
print(x_indexed)
>>> [array([2]),
array([6, 7, 8]),
array([12, 13, 14]),
array([15, 16, 17]),
array([20, 21, 22, 23]),
array([26, 27, 28, 29]),
array([30, 31, 32, 33]),
array([35, 36, 37, 38, 39]),
array([40, 41, 42]),
array([46, 47, 48]),
array([52, 53, 54]),
array([56, 57]),
array([62, 63, 64]),
array([67]),
array([70, 71, 72, 73]),
array([77]),
array([80, 81, 82, 83, 84]),
array([86, 87]),
array([90, 91, 92]),
array([97])]
Upvotes: 2
Views: 551
Reputation: 59701
You can work with masked arrays:
import numpy as np
np.random.seed(100)
x = np.arange(0, 100)
x = x.reshape(20, 5)
s_idx = np.random.randint(0, 3, x.shape[0])
e_idx = np.random.randint(3, 6, x.shape[0])
# This is optional, reduce x to the minimum possible block
first_col, last_col = s_idx.min(), e_idx.max()
x = x[:, first_col:last_col]
s_idx -= first_col
e_idx -= first_col
col_idx = np.arange(x.shape[1])
# Mask elements out of range
mask = (col_idx < s_idx[:, np.newaxis]) | (col_idx >= e_idx[:, np.newaxis])
x_masked = np.ma.array(x, mask=mask)
print(x_masked)
Output:
[[0 1 2 3 --]
[5 6 7 8 9]
[10 11 12 13 14]
[-- -- 17 -- --]
[-- -- 22 -- --]
[25 26 27 28 --]
[-- -- 32 33 --]
[-- 36 37 38 --]
[-- -- 42 -- --]
[-- -- 47 -- --]
[-- -- 52 53 --]
[-- -- 57 58 --]
[-- 61 62 63 --]
[65 66 67 68 69]
[70 71 72 -- --]
[75 76 77 78 79]
[80 81 82 83 --]
[-- -- 87 88 --]
[90 91 92 93 94]
[-- 96 97 98 99]]
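As a rough illustration of what the masked array gives you directly, here is a minimal sketch of per-row reductions that skip the masked-out columns. The small `x`, `s_idx` and `e_idx` below are made-up inputs chosen for readability, not the random data from the post; the mask is built the same way as above.

```python
import numpy as np

# Hypothetical small input (not the data from the post)
x = np.arange(20).reshape(4, 5)
s_idx = np.array([2, 1, 0, 3])   # first column to keep, per row
e_idx = np.array([3, 4, 5, 5])   # one past the last column to keep, per row

# Mask every element that falls outside [s_idx, e_idx) for its row
col_idx = np.arange(x.shape[1])
mask = (col_idx < s_idx[:, np.newaxis]) | (col_idx >= e_idx[:, np.newaxis])
x_masked = np.ma.array(x, mask=mask)

# Reductions along axis=1 only see the unmasked elements of each row
row_sums = x_masked.sum(axis=1)      # sum of each row's slice
row_counts = x_masked.count(axis=1)  # number of kept elements per row
row_means = x_masked.mean(axis=1)    # mean of each row's slice

print(row_sums)
print(row_counts)
print(row_means)
```

If all you need is a per-row statistic like this, there is no Python-level loop at all, which is probably what matters most for more than a million rows.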
You can do most NumPy operations with a masked array, but if you still want the list of arrays you could do something like:
list_arrays = [row[~m] for row, m in zip(x, x_masked.mask)]
print(list_arrays)
Output:
[array([0, 1, 2, 3]),
array([5, 6, 7, 8, 9]),
array([10, 11, 12, 13, 14]),
array([17]),
array([22]),
array([25, 26, 27, 28]),
array([32, 33]),
array([36, 37, 38]),
array([42]),
array([47]),
array([52, 53]),
array([57, 58]),
array([61, 62, 63]),
array([65, 66, 67, 68, 69]),
array([70, 71, 72]),
array([75, 76, 77, 78, 79]),
array([80, 81, 82, 83]),
array([87, 88]),
array([90, 91, 92, 93, 94]),
array([96, 97, 98, 99])]
In this case, though, you do not need to construct the intermediate masked array at all: you can just iterate through the rows of x and mask.
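If the list comprehension itself is the bottleneck, one possible variation (not part of the approach above, just a sketch) is to pull out all kept elements with a single boolean index and then cut the flat result at the row boundaries with `np.split`. The inputs are again small made-up arrays, and `keep`, `flat`, `lengths` and `offsets` are names chosen here for illustration.

```python
import numpy as np

# Hypothetical small input (not the data from the post)
x = np.arange(20).reshape(4, 5)
s_idx = np.array([2, 1, 0, 3])   # first column to keep, per row
e_idx = np.array([3, 4, 5, 5])   # one past the last column to keep, per row

# True where an element lies inside [s_idx, e_idx) for its row
col_idx = np.arange(x.shape[1])
keep = (col_idx >= s_idx[:, np.newaxis]) & (col_idx < e_idx[:, np.newaxis])

# Boolean indexing returns the kept elements as one 1-D array in row-major order
flat = x[keep]

# Number of kept elements per row, and the positions where each new row starts
lengths = keep.sum(axis=1)
offsets = np.cumsum(lengths)[:-1]

# Cut the flat array back into one array per row
list_arrays = np.split(flat, offsets)
print(list_arrays)
```

Note that `np.split` still has to create one array object per row, so for very large inputs it may be preferable to keep `flat` and `offsets` as they are and work on those directly.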
Upvotes: 2