JohnnieL

Reputation: 1231

pandas mask: set values where a condition is true and where it is false

I have a matrix of values. I want to rank the values within each column, then set the top-ranked values to 1 and all others to 0.

I have tried to do this using nlargest and head, but the only solution I can figure out is to apply mask twice.

My solution is below, but is there a smarter way to do this?

Many thanks,

John

import pandas as pd
df = pd.DataFrame([(1, 2, 3),
                   (4, 5, 6),
                   (7, 8, 9),
                   (11, 21, 31),
                   (41, 51, 31),
                   (71, 51, 61),
                   (71, 81, 91)],
                  columns=('value_1','value_2','value_3'))

   value_1  value_2  value_3
0        1        2        3
1        4        5        6
2        7        8        9
3       11       21       31
4       41       51       31
5       71       51       61
6       71       81       91
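N = 3  # arbitrary cut-off: keep the top N ranks in each column
# Rank each column in descending order; tied values share the lowest rank number
df = df.rank(ascending=False, axis=0, method='min')
df.mask(df > N, 0, inplace=True)  # ranks beyond the cut-off become 0
df.mask(df > 0, 1, inplace=True)  # i.e. values not previously masked become 1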

Resulting df

   value_1  value_2  value_3
0        0        0        0
1        0        0        0
2        0        0        0
3        0        0        1
4        1        1        1
5        1        1        1
6        1        1        1
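
For reference, the two mask calls amount to a single elementwise choice between 1 and 0. A minimal sketch of that idea using numpy.where, assuming the same data and cut-off as above (the names ranks and flags are only illustrative, and this is one possible way to write it rather than the definitive one):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([(1, 2, 3),
                   (4, 5, 6),
                   (7, 8, 9),
                   (11, 21, 31),
                   (41, 51, 31),
                   (71, 51, 61),
                   (71, 81, 91)],
                  columns=('value_1', 'value_2', 'value_3'))

N = 3  # arbitrary cut-off

# Rank each column in descending order; ties share the lowest rank number
ranks = df.rank(ascending=False, axis=0, method='min')

# Pick 1 where the rank is within the top N, otherwise 0, in one pass
flags = pd.DataFrame(np.where(ranks <= N, 1, 0),
                     index=df.index, columns=df.columns)
print(flags)
```

The 1 and 0 passed to numpy.where can be swapped for any other pair of values if different labels are needed.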

Upvotes: 1

Views: 905

Answers (1)

It_is_Chris

Reputation: 14063

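Try creating a boolean mask from the ranks and then converting it to integers with astype. This assumes df is still the original data, not the ranked frame: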

(~(df.rank(ascending=False, axis=0, method='min') > N)).astype(int)

   value_1  value_2  value_3
0        0        0        0
1        0        0        0
2        0        0        0
3        0        0        1
4        1        1        1
5        1        1        1
6        1        1        1
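
As a self-contained sketch of the same idea, the negated comparison can also be written as a direct "less than or equal" test. The example below assumes the data and N from the question; ranks and top_n are just illustrative names:

```python
import pandas as pd

df = pd.DataFrame([(1, 2, 3),
                   (4, 5, 6),
                   (7, 8, 9),
                   (11, 21, 31),
                   (41, 51, 31),
                   (71, 51, 61),
                   (71, 81, 91)],
                  columns=('value_1', 'value_2', 'value_3'))

N = 3  # arbitrary cut-off

# Rank each column in descending order; ties share the lowest rank number
ranks = df.rank(ascending=False, axis=0, method='min')

# True where the rank is within the top N, then cast True/False to 1/0
top_n = ranks.le(N).astype(int)
print(top_n)
```

One difference to be aware of: if the data contains NaN, its rank is NaN by default, so ranks.le(N) yields 0 for that cell while ~(ranks > N) yields 1. For data without missing values, as here, the two forms give the same result.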

Upvotes: 1
