Reputation: 329
I need to iterate over the rows of df in groups defined by its index. Within each group I need to find the max in column p1 and mark it in the output dataframe (along with the max p1 value), and do the same for column p2. In each range of row indexes (sub_1_ica_1 ---> sub_1_ica_n) there must be exactly one 1 and one 2, and the remaining entries should be filled with zeros. That is why I need to do the operation range by range.
I tried to split the index name and build a counter per subject to use when iterating over the rows, but I feel I am going about it the wrong way:
from collections import Counter

a = df.id.tolist()
indlist = []
for x in a:
    # ids look like 'sub_1_ica_3'; the second field is the subject number
    i = x.split('_')
    b = int(i[1])
    indlist.append(b)
# number of rows per subject
c = Counter(indlist)
keyInd = c.keys()
Any ideas?
EDIT: Following Jerazel's example, my desired output would look like this. First I find the max of the p1 and p2 columns, which are translated into 1 and 2 in the new df, and the remaining fields are zeros.
Upvotes: 1
Views: 1705
Reputation: 862661
I think you need numpy.argmax with max; if you also need the column names, use idxmax:
import pandas as pd

idx = ['sub_1_ICA_0','sub_1_ICA_1','sub_1_ICA_2','sub_2_ICA_0','sub_2_ICA_1','sub_2_ICA_2']
df = pd.DataFrame({'p0':[7,8,9,4,2,3],
                   'p1':[1,3,5,7,1,0],
                   'p2':[5,9,6,1,2,4]}, index=idx)
print (df)

cols = ['p0','p1','p2']
# position of the row-wise max among cols
df['a'] = df[cols].values.argmax(axis=1)
# value of the row-wise max
df['b'] = df[cols].max(axis=1)
# column name of the row-wise max
df['c'] = df[cols].idxmax(axis=1)
print (df)
             p0  p1  p2  a  b   c
sub_1_ICA_0   7   1   5  0  7  p0
sub_1_ICA_1   8   3   9  2  9  p2
sub_1_ICA_2   9   5   6  0  9  p0
sub_2_ICA_0   4   7   1  1  7  p1
sub_2_ICA_1   2   1   2  0  2  p0
sub_2_ICA_2   3   0   4  2  4  p2
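The code above works row by row. If the max is needed per subject instead (one 1 for the largest p1 and one 2 for the largest p2 inside each sub_N block, zeros elsewhere), a groupby on the subject part of the index may be closer to what the question describes. This is only a sketch under assumptions: the index labels follow the pattern sub_<number>_ICA_<n>, and the names subject, top_p1, top_p2 and out are made up for illustration.

```python
import pandas as pd

idx = ['sub_1_ICA_0', 'sub_1_ICA_1', 'sub_1_ICA_2',
       'sub_2_ICA_0', 'sub_2_ICA_1', 'sub_2_ICA_2']
df = pd.DataFrame({'p0': [7, 8, 9, 4, 2, 3],
                   'p1': [1, 3, 5, 7, 1, 0],
                   'p2': [5, 9, 6, 1, 2, 4]}, index=idx)

# Assumption: labels look like 'sub_<number>_ICA_<n>'; pull out the subject number
subject = df.index.to_series().str.extract(r'^sub_(\d+)_', expand=False)

# Index label of the row holding the max of p1 / p2 within each subject
top_p1 = df.groupby(subject)['p1'].idxmax()
top_p2 = df.groupby(subject)['p2'].idxmax()

# Output starts as all zeros; mark the per-subject winners with 1 and 2
out = pd.DataFrame(0, index=df.index, columns=['p1', 'p2'])
out.loc[top_p1.tolist(), 'p1'] = 1
out.loc[top_p2.tolist(), 'p2'] = 2
print(out)
```

Keeping p1 and p2 in separate output columns avoids a clash when the same row holds the max of both. On ties, idxmax returns the first matching row, so each subject still gets exactly one 1 and one 2.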
Upvotes: 2