Reputation: 43
How can I compare each value in a pandas DataFrame against the median of its column, giving True if the value is greater than the median and False if it is less?
Right now I am standardizing, so I am basically comparing to 0 (the mean) of each column. I want a way to do the same for the median.
Upvotes: 1
Views: 1280
Reputation: 3739
As I understand your question, you want to compare each value in a column against that column's median:
import pandas as pd

df = pd.DataFrame(data={'a': [1, 2, 3, 4, 4, 5],
                        'b': [1, 2, 3, 3, 3, 3]})
# the median of col a and col b is calculated and saved in another column
df['median_a'] = df['a'].median()
df['median_b'] = df['b'].median()
# a_bool is True where the col a value is greater than median_a, else False
df['a_bool'] = df['a'] > df['median_a']
df['b_bool'] = df['b'] > df['median_b']
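If you don't need the helper median columns, a vectorized comparison of the whole frame should also work. This is only a minimal sketch, assuming every column is numeric; the names `medians` and `above` are illustrative and not from the question:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 4, 5],
                   'b': [1, 2, 3, 3, 3, 3]})

# One median per column, returned as a Series indexed by column name
medians = df.median()

# Compare every value with its own column's median; the Series is
# aligned on the column labels, so this is equivalent to df > medians
above = df.gt(medians)
```

Values exactly equal to the median come out as False here; use `df.ge(medians)` instead if they should count as True.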
I hope this solves your problem.
Upvotes: 3