Ben

Reputation: 329

iterate over Dataframe row by index value and find max

I need to iterate over the rows of df in groups defined by the index. Within each group I need to find the max in column p1 and mark it in the output dataframe (along with the max p1 value), and do the same for column p2. In each range of row indexes (sub_1_ica_1 ---> sub_1_ica_n) there must be exactly one 1 and one 2, and the remaining rows must be filled with zeros. That is why the operation has to be done range by range.


I tried to split the index names and build a counter per subject to drive the iteration over the rows, but I feel I am going about it the wrong way!

    from collections import Counter

    a = df.id.tolist()
    indlist = []
    for x in a:
        # 'sub_1_ica_3' -> ['sub', '1', 'ica', '3']; keep the subject number
        i = x.split('_')
        b = int(i[1])
        indlist.append(b)
    # number of rows per subject
    c = Counter(indlist)
    keyInd = c.keys()

Any ideas?

EDIT: Based on jezrael's example, my desired output would look like this. First I find the max of the p1 and p2 columns; in the new df these are translated into 1 and 2, and the remaining fields are zeros.

[Image: desired output dataframe]

Upvotes: 1

Views: 1705

Answers (1)

jezrael

Reputation: 862661

I think you need numpy.argmax together with max; if you also need the column names, use idxmax:

    import pandas as pd

    idx = ['sub_1_ICA_0','sub_1_ICA_1','sub_1_ICA_2','sub_2_ICA_0','sub_2_ICA_1','sub_2_ICA_2']
    df = pd.DataFrame({'p0':[7,8,9,4,2,3],
                       'p1':[1,3,5,7,1,0],
                       'p2':[5,9,6,1,2,4]}, index=idx)

    print (df)

    cols = ['p0','p1','p2']
    # position of the row-wise maximum (0-based column position)
    df['a'] = df[cols].values.argmax(axis=1)
    # value of the row-wise maximum
    df['b'] = df[cols].max(axis=1)
    # column name of the row-wise maximum
    df['c'] = df[cols].idxmax(axis=1)
    print (df)
                 p0  p1  p2  a  b   c
    sub_1_ICA_0   7   1   5  0  7  p0
    sub_1_ICA_1   8   3   9  2  9  p2
    sub_1_ICA_2   9   5   6  0  9  p0
    sub_2_ICA_0   4   7   1  1  7  p1
    sub_2_ICA_1   2   1   2  0  2  p0
    sub_2_ICA_2   3   0   4  2  4  p2
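If the maximum has to be found per subject (down each column within a range of rows, as the question describes) rather than per row, a groupby on the subject part of the index may be closer to what is wanted. The following is only a sketch under assumptions: the desired output was posted as an image, so the output column names `label` and `value` are hypothetical, and the subject key is assumed to be the first two underscore-separated parts of each index name.

```python
import pandas as pd

idx = ['sub_1_ICA_0', 'sub_1_ICA_1', 'sub_1_ICA_2',
       'sub_2_ICA_0', 'sub_2_ICA_1', 'sub_2_ICA_2']
df = pd.DataFrame({'p0': [7, 8, 9, 4, 2, 3],
                   'p1': [1, 3, 5, 7, 1, 0],
                   'p2': [5, 9, 6, 1, 2, 4]}, index=idx)

# Assumed subject key: 'sub_1_ICA_0' -> 'sub_1'
subject = pd.Series(['_'.join(name.split('_')[:2]) for name in df.index],
                    index=df.index)

# Row label of the per-subject maximum in p1 and in p2
rows_p1 = df.groupby(subject)['p1'].idxmax().to_numpy()
rows_p2 = df.groupby(subject)['p2'].idxmax().to_numpy()

# Hypothetical output layout: 'label' is 1 / 2 / 0, 'value' is the max itself
out = pd.DataFrame(0, index=df.index, columns=['label', 'value'])
out.loc[rows_p1, 'label'] = 1
out.loc[rows_p1, 'value'] = df.loc[rows_p1, 'p1'].to_numpy()
# If one row holds both maxima, the p2 assignment below overwrites the p1 one
out.loc[rows_p2, 'label'] = 2
out.loc[rows_p2, 'value'] = df.loc[rows_p2, 'p2'].to_numpy()

print(out)
```

With the sample data each subject ends up with exactly one 1 and one 2, and every other row stays 0. Ties are resolved by `idxmax`, which returns the first occurrence of the maximum.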

Upvotes: 2
