sachinruk

Reputation: 9869

Get pandas data by multiple indices

Suppose I have the following data:

import numpy as np
import pandas as pd

np.random.seed(42)
data = pd.DataFrame(np.random.randn(3, 3))

And I wish to grab the values indexed by the rows and columns:

rows = [0, 1, 2]
cols = [2, 1, 1]

However, data.loc[rows, cols] selects every combination of the given rows and columns and returns a DataFrame:

          2         1         1
0  0.647689 -0.138264 -0.138264
1 -0.234137 -0.234153 -0.234153
2 -0.469474  0.767435  0.767435

The behaviour I'm actually after is one value per (row, column) pair:

[data.loc[r, c] for r,c in zip(rows, cols)]
>>> [0.6476885381006925, -0.23415337472333597, 0.7674347291529088]

What is the correct and efficient way of doing this in pandas?

Upvotes: 1

Views: 63

Answers (2)

seeeheng

Reputation: 11

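You're looking for iat, the built-in pandas accessor for reading a single value by integer position.

Taking your exact code and replacing data.loc with data.iat produces the output you need. This works here because the default row and column labels coincide with the integer positions; the label-based counterpart for scalar access is data.at.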

np.random.seed(42)
data = pd.DataFrame(np.random.randn(3,3))

rows = [0, 1, 2]
cols = [2, 1, 1]

[data.iat[r, c] for r,c in zip(rows,cols)]

>>> [0.6476885381006925, -0.23415337472333597, 0.7674347291529088]
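To illustrate the difference between positional and label-based scalar access, here is a minimal sketch. The DataFrame name `labeled` and the string labels "a".."c" and "x".."z" are made up for this example and are not part of the question; the values are the same seeded random numbers.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
# Hypothetical labels, chosen only so that labels differ from positions.
labeled = pd.DataFrame(
    np.random.randn(3, 3),
    index=["a", "b", "c"],
    columns=["x", "y", "z"],
)

# iat takes integer positions, regardless of the labels.
rows = [0, 1, 2]
cols = [2, 1, 1]
by_position = [labeled.iat[r, c] for r, c in zip(rows, cols)]

# at takes labels; these labels point at the same cells as above.
row_labels = ["a", "b", "c"]
col_labels = ["z", "y", "y"]
by_label = [labeled.at[r, c] for r, c in zip(row_labels, col_labels)]

print(by_position == by_label)
```

Both list comprehensions still loop in Python, so this is fine for a handful of lookups but is likely to be slow for large inputs.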

Upvotes: 1

jezrael

Reputation: 862406

Convert the DataFrame to a 2D NumPy array and use fancy indexing, which pairs each row index with the column index at the same position:

# pandas 0.24+
arr = data.to_numpy()[rows, cols]
# pandas < 0.24
arr = data.values[rows, cols]
print(arr)
[ 0.64768854 -0.23415337  0.76743473]
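Note that NumPy fancy indexing is positional, so it matches the question only because the labels there are 0, 1, 2. If the labels differ from the positions, one possible approach is to translate labels to positions first with Index.get_indexer. The sketch below assumes unique labels and pandas 0.24+ (for to_numpy); the names `labeled`, `row_labels` and `col_labels` are invented for the example.

```python
import numpy as np
import pandas as pd

np.random.seed(42)
# Hypothetical labels, not from the question.
labeled = pd.DataFrame(
    np.random.randn(3, 3),
    index=["a", "b", "c"],
    columns=["x", "y", "z"],
)

row_labels = ["a", "b", "c"]
col_labels = ["z", "y", "y"]

# Translate labels into integer positions (requires unique labels).
row_pos = labeled.index.get_indexer(row_labels)
col_pos = labeled.columns.get_indexer(col_labels)

# get_indexer marks labels it cannot find with -1, which NumPy would
# silently treat as "last element", so check before indexing.
if (row_pos < 0).any() or (col_pos < 0).any():
    raise KeyError("some labels were not found")

# Pairwise lookup: one value per (row, column) pair.
picked = labeled.to_numpy()[row_pos, col_pos]
print(picked)
```

The lookup itself runs inside NumPy rather than in a Python loop, which is why this style tends to scale better than a list comprehension over iat or at.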

Upvotes: 2
