Reputation: 2651
Below is code that does what I want on some simple sample data. I binned the data using np.digitize and then computed a column index based on this question. bin_idx is known to never decrease, in case that helps. How can I build the 2D array by indexing, without an explicit loop? One complication is that the number of values in each row/bin varies. I will later compute different statistics on each bin/row; max is just an example.
import numpy as np
x = np.arange(10)
bin_idx = np.array([0, 0, 0, 1, 2, 3, 3, 4, 4, 4])
col_idx = np.array([0, 1, 2, 0, 0, 0, 1, 0, 1, 2])
binned = np.full((bin_idx[-1] + 1, col_idx.max() + 1), np.nan)
for i in range(len(x)):
    binned[bin_idx[i], col_idx[i]] = x[i]
print(binned)
row_max = np.nanmax(binned, axis=1)
print(row_max)
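For reference, here is a minimal sketch of what a loop-free version might look like, assuming the same sample arrays as above. The names binned_vec, row_max_vec, starts, and row_max_reduceat are made up for illustration. The first part relies on NumPy integer-array indexing; the second part is an alternative that skips the padded 2D array entirely and assumes that bin_idx is non-decreasing (as stated) and that the statistic is available as a ufunc with reduceat.

```python
import numpy as np

x = np.arange(10)
bin_idx = np.array([0, 0, 0, 1, 2, 3, 3, 4, 4, 4])
col_idx = np.array([0, 1, 2, 0, 0, 0, 1, 0, 1, 2])

# Integer-array indexing assigns every (row, column) pair in one step,
# so the explicit Python loop is not needed.
binned_vec = np.full((bin_idx[-1] + 1, col_idx.max() + 1), np.nan)
binned_vec[bin_idx, col_idx] = x

row_max_vec = np.nanmax(binned_vec, axis=1)

# Alternative without the NaN-padded array (assumption: bin_idx never
# decreases, so each bin is a contiguous slice of x).
# starts holds the index where each bin begins.
starts = np.flatnonzero(np.r_[True, np.diff(bin_idx) != 0])
row_max_reduceat = np.maximum.reduceat(x, starts)

print(binned_vec)
print(row_max_vec)
print(row_max_reduceat)
```

The padded-array route works with any of the nan-aware functions (np.nanmean, np.nanstd, and so on), at the cost of memory when bin sizes are very uneven. The reduceat route only covers statistics expressible as a ufunc reduction, and bins that contain no values simply do not appear in its output.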
Upvotes: 0
Views: 896