Reputation: 704
My DataFrame currently looks like this:
Source:
index col1 col2 col3
row1 100 50 0
row2 -100 50 -25
row3 0 0 0
row4 -1 -1 -1
row5 1 1 1
row6 -100 0 1
My target is:
index col1 col2 col3
row1 1.0 0.5 0.0
row2 0 1 0.5
row3 0 0 0
row4 0 0 0
row5 0 0 0
row6 0 0.99 1
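For reference, a minimal sketch that builds the source frame above (the row labels are assumed to be the index):
import pandas as pd

# source frame from the question; row labels used as the index
df = pd.DataFrame(
    {"col1": [100, -100, 0, -1, 1, -100],
     "col2": [50, 50, 0, -1, 1, 0],
     "col3": [0, -25, 0, -1, 1, 1]},
    index=["row1", "row2", "row3", "row4", "row5", "row6"],
)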
What I tried from Stack Overflow answers:
This normalizes each row by its sum, not by the row min/max:
df = (df.T / df.T.sum()).T
This does the same, dividing each row by its sum:
df = df.div(df.sum(axis=1), axis=0)
This normalizes each row by its L2 norm, again not by the row min/max:
from sklearn.preprocessing import Normalizer
df.iloc[:, :] = Normalizer(norm='l2').fit_transform(df)
I also tried changing the axis arguments in
df.div(df.sum(axis=1), axis=0)
but as soon as I change either axis it throws an error.
From reading the pandas DataFrame built-in functions I can't see a simple, Pythonic way to achieve this without a complicated lambda in an apply that first stores the min and max of each row. Pandas also says we should not iterate over rows and change values :-( so I am a bit lost and would appreciate some input.
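For reference, the apply-based approach I mean would look roughly like this (just a sketch of what I want to avoid):
# row-wise min-max via apply with a lambda; constant rows give NaN (0/0)
df.apply(lambda row: (row - row.min()) / (row.max() - row.min()), axis=1)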
Upvotes: 1
Views: 2019
Reputation: 57033
Subtract the row minimum and divide by the row range (max minus min). Rows whose values are all equal come out as NaNs. Fill them with the original values.
Code:
df.subtract(df.min(axis=1), axis=0)\
.divide(df.max(axis=1) - df.min(axis=1), axis=0)\
.combine_first(df)
# col1 col2 col3
#row1 1.0 0.500000 0.0
#row2 0.0 1.000000 0.5
#row3 0.0 0.000000 0.0
#row4 -1.0 -1.000000 -1.0
#row5 1.0 1.000000 1.0
#row6 0.0 0.990099 1.0
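If the constant rows should become 0 instead, as in the target shown in the question, one variant (a sketch that swaps combine_first for fillna) would be:
# same row-wise min-max, but constant rows (all NaN after the division) become 0
df.subtract(df.min(axis=1), axis=0)\
  .divide(df.max(axis=1) - df.min(axis=1), axis=0)\
  .fillna(0)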
Upvotes: 3