user1639926

Reputation: 862

How to quickly get a specific index from each array in an np.array of np.arrays

In its simplest form, I have the following dataframe:

import numpy as np
import pandas as pd

a = {'possibility': np.array([1, 2, 3])}
b = {'possibility': np.array([4, 5, 6])}

df = pd.DataFrame([a, b])

This gives me a 2x1 dataframe, like so:

row 1:  np.array([1,2,3])
row 2:  np.array([4,5,6])

I also have a vector of length 2:

[1,2]

Each value is the (zero-based) index I want from the corresponding row.
So with [1,2] I want element 2 from row 1 and element 6 from row 2. Ideally, my output is the vector [2,6], of length 2.

Is this possible? I can easily do this with a for loop, but I am looking for FAST approaches, ideally vectorized ones, since the data is already in pandas/numpy.

In my actual use case, I need this to work on roughly 300k-400k rows, and it has to run inside optimization problems (hence the need for speed).

Upvotes: 1

Views: 44

Answers (1)

mozway

Reputation: 260480

You could convert the column to a 2D numpy array and use np.take_along_axis:

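# index to pick from each row
v = np.array([1, 2])
# stack the per-row arrays into a 2D array of shape (n_rows, 3)
a = np.vstack(df['possibility'])
# for each column j of a.T (row j of a), take the element at position v[j]
np.take_along_axis(a.T, v[None], axis=0)[0]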

Output: array([2, 6])
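
For reference, here is a minimal self-contained sketch of the same idea, assuming every row holds an array of the same length (np.vstack needs that). The names stacked, picked_take and picked_fancy are illustrative and not from the original post. It also shows integer-array indexing, which should give the same result without transposing:

```python
import numpy as np
import pandas as pd

# Same toy data as in the question
df = pd.DataFrame([
    {'possibility': np.array([1, 2, 3])},
    {'possibility': np.array([4, 5, 6])},
])
v = np.array([1, 2])  # zero-based position to pick from each row

# Stack the per-row arrays into one 2D array (assumes equal lengths)
stacked = np.vstack(df['possibility'].tolist())

# take_along_axis along the columns: one index per row
picked_take = np.take_along_axis(stacked, v[:, None], axis=1)[:, 0]

# Equivalent integer-array indexing: pair row i with column v[i]
picked_fancy = stacked[np.arange(len(v)), v]

print(picked_take)   # [2 6]
print(picked_fancy)  # [2 6]
```

If the selection runs many times inside an optimization loop, it is probably worth building the stacked array once up front and reusing it, so only the indexing step is repeated.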

Upvotes: 1
