Reputation: 9869
Suppose I have the following data:
import numpy as np
import pandas as pd
np.random.seed(42)
data = pd.DataFrame(np.random.randn(3, 3))
I want to grab the values at paired row and column positions, i.e. the elements at (0, 2), (1, 1) and (2, 1):
rows = [0, 1, 2]
cols = [2, 1, 1]
However, doing data.loc[rows, cols] gives me the full 3x3 DataFrame of every row/column combination:
          2         1         1
0  0.647689 -0.138264 -0.138264
1 -0.234137 -0.234153 -0.234153
2 -0.469474  0.767435  0.767435
Whereas the behaviour that I'm really after is the following:
[data.loc[r, c] for r,c in zip(rows, cols)]
>>> [0.6476885381006925, -0.23415337472333597, 0.7674347291529088]
What is the correct and efficient way of doing this in pandas?
Upvotes: 1
Views: 63
Reputation: 11
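You're looking for iat, a built-in pandas accessor for reading a single value by integer position.
Running the same code you had, with data.loc replaced by data.iat, generates the output you need. Note that iat is position-based while loc is label-based; the two agree here only because the DataFrame uses the default integer index and column labels.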
np.random.seed(42)
data = pd.DataFrame(np.random.randn(3,3))
rows = [0, 1, 2]
cols = [2, 1, 1]
[data.iat[r, c] for r,c in zip(rows,cols)]
>>> [0.6476885381006925, -0.23415337472333597, 0.7674347291529088]
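To illustrate the position-versus-label distinction, here is a minimal sketch that assumes the same data but with string labels. The labels "a".."c" and "x".."z" are hypothetical and chosen only so that labels and positions differ. It pairs iat with integer positions and at (the label-based scalar accessor) with labels:

```python
import numpy as np
import pandas as pd

np.random.seed(42)
# Hypothetical string labels, used only so labels differ from positions.
data = pd.DataFrame(
    np.random.randn(3, 3),
    index=["a", "b", "c"],
    columns=["x", "y", "z"],
)

# Positional lookup: iat takes integer positions.
pos_rows, pos_cols = [0, 1, 2], [2, 1, 1]
by_position = [data.iat[r, c] for r, c in zip(pos_rows, pos_cols)]

# Label lookup: at takes index and column labels.
lab_rows, lab_cols = ["a", "b", "c"], ["z", "y", "y"]
by_label = [data.at[r, c] for r, c in zip(lab_rows, lab_cols)]

print(by_position == by_label)
```

Both lists contain the same three values, because each label pair refers to the same cell as the corresponding position pair. This is still a Python-level loop, so it is likely to be slow for long lists of pairs.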
Upvotes: 1
Reputation: 862406
Convert the DataFrame to a 2D NumPy array and use fancy (integer array) indexing, which picks one element per (row, column) pair:
# pandas 0.24+
arr = data.to_numpy()[rows, cols]
# older pandas versions
arr = data.values[rows, cols]
print(arr)
[ 0.64768854 -0.23415337 0.76743473]
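The fancy indexing above works on integer positions. If rows and cols were labels rather than positions, one possible approach (a sketch, not part of the original answer) is to translate labels to positions with Index.get_indexer first. The string labels below are hypothetical, and the sketch assumes the index and columns are unique:

```python
import numpy as np
import pandas as pd

np.random.seed(42)
# Hypothetical string labels, used only to show label-based lookup.
data = pd.DataFrame(
    np.random.randn(3, 3),
    index=["a", "b", "c"],
    columns=["x", "y", "z"],
)

row_labels = ["a", "b", "c"]
col_labels = ["z", "y", "y"]

# Translate labels to integer positions; get_indexer returns -1 for missing labels.
row_pos = data.index.get_indexer(row_labels)
col_pos = data.columns.get_indexer(col_labels)
if (row_pos < 0).any() or (col_pos < 0).any():
    raise KeyError("some labels were not found in the index or columns")

# One element per (row, column) pair, with no Python-level loop.
arr = data.to_numpy()[row_pos, col_pos]
print(arr)
```

The explicit check for -1 matters because NumPy would otherwise treat -1 as "last element" and silently return the wrong value.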
Upvotes: 2