Reputation: 77
I have a dataframe like this:
test = pd.DataFrame({"id":[1,2,3,4],
"name_1":["peter","bobby","alex","chris"],
"name_1_flag":["real","fake","fake","real"],
"name_2":["hector","abi","henrik","miko"],
"name_2_flag":["fake","real","fake","fake"],
"name_3":["hans","khan","will","than"],
"name_3_flag":["fake","fake","real","fake"]})
id name_1 name_1_flag name_2 name_2_flag name_3 name_3_flag
0 1 peter real hector fake hans fake
1 2 bobby fake abi real khan fake
2 3 alex fake henrik fake will real
3 4 chris real miko fake than fake
How can I find the row/column tuple that has the word "real" in it.
Optimally the output would be an array or series like this:
col_index
0 3
1 5
2 7
3 3
Upvotes: 4
Views: 361
Reputation: 22503
You can also use dot
:
print (test.eq("real").dot(range(test.columns.size))+1)
0 3
1 5
2 7
3 3
dtype: int32
Upvotes: 1
Reputation: 323366
Let us try with
s = test.where(lambda x : x=='real').stack()
test['new'] = test.columns.get_indexer(s.index.get_level_values(1))+1
test
Out[11]:
id name_1 name_1_flag name_2 name_2_flag name_3 name_3_flag new
0 1 peter real hector fake hans fake 3
1 2 bobby fake abi real khan fake 5
2 3 alex fake henrik fake will real 7
3 4 chris real miko fake than fake 3
Upvotes: 1
Reputation: 71610
Try np.argmax
:
>>> np.argmax(test.eq('real').to_numpy(), axis=1) + 1
array([3, 5, 7, 3], dtype=int64)
>>>
--- Or get_indexer
:
test.columns.get_indexer(test.eq('real').idxmax(axis=1)) + 1
Or .T.reset_index(drop=True)
:
test.T.reset_index(drop=True).eq('real').idxmax() + 1
np.argmax
:
test["col_index"] = np.argmax(test.eq('real').to_numpy(), axis=1) + 1
The one with get_indexer
:
test["col_index"] = test.columns.get_indexer(test.eq('real').idxmax(axis=1)) + 1
.T
:
test["col_index"] = test.T.reset_index(drop=True).eq('real').idxmax() + 1
All Output:
id name_1 name_1_flag name_2 name_2_flag name_3 name_3_flag col_index
0 1 peter real hector fake hans fake 3
1 2 bobby fake abi real khan fake 5
2 3 alex fake henrik fake will real 7
3 4 chris real miko fake than fake 3
Upvotes: 3
Reputation: 61930
Use np.where:
test["col_index"] = np.where(test.eq("real"))[1] + 1
print(test)
Output
id name_1 name_1_flag name_2 name_2_flag name_3 name_3_flag col_index
0 1 peter real hector fake hans fake 3
1 2 bobby fake abi real khan fake 5
2 3 alex fake henrik fake will real 7
3 4 chris real miko fake than fake 3
Upvotes: 4