Reputation: 133
Is there a way to normalize the columns of a DataFrame using sklearn's normalize? I think that by default it normalizes rows.
For example, if I had df:
A B
1000 10
234 3
500 1.5
I would want to get the following:
A B
1 1
0.234 0.3
0.5 0.15
Upvotes: 2
Views: 3311
Reputation: 59519
By default, sklearn's normalize scales each row using the L2 norm. Both of these arguments need to be changed to get the normalization you want, which divides each column by its maximum value:
from sklearn import preprocessing
preprocessing.normalize(df, axis=0, norm='max')
#array([[1. , 1. ],
# [0.234, 0.3 ],
# [0.5 , 0.15 ]])
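Since normalize returns a plain NumPy array, the column labels are lost. A minimal sketch of wrapping the result back into a DataFrame, assuming the df from the question (the names scaled and normalized are just illustrative):

```python
import pandas as pd
from sklearn import preprocessing

# The example frame from the question.
df = pd.DataFrame({"A": [1000, 234, 500], "B": [10, 3, 1.5]})

# axis=0 scales each column (feature); norm="max" divides by the column maximum.
scaled = preprocessing.normalize(df, axis=0, norm="max")

# normalize returns an ndarray, so restore the original labels.
normalized = pd.DataFrame(scaled, columns=df.columns, index=df.index)
print(normalized)
```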
Upvotes: 1
Reputation: 1440
From the documentation
axis : 0 or 1, optional (1 by default) axis used to normalize the data along. If 1, independently normalize each sample, otherwise (if 0) normalize each feature.
So just change the axis. Having said that, sklearn is overkill for this task; it can be achieved easily using pandas.
Upvotes: 0
Reputation: 323226
You can use div after getting the maximum of each column:
df.div(df.max(),1)
Out[456]:
A B
0 1.000 1.00
1 0.234 0.30
2 0.500 0.15
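The second argument to div is the axis on which the divisor's labels are aligned, which is easy to misread when passed positionally. A small sketch, assuming the df from the question, that spells the axis out by name and shows the row-wise variant for contrast:

```python
import pandas as pd

df = pd.DataFrame({"A": [1000, 234, 500], "B": [10, 3, 1.5]})

# Column-wise: df.max() is indexed by column label, so align it on the columns.
# This is equivalent to df.div(df.max(), 1).
by_column = df.div(df.max(), axis="columns")

# Row-wise, for contrast: df.max(axis=1) is indexed by row label,
# so it has to be aligned on the index instead.
by_row = df.div(df.max(axis=1), axis="index")

print(by_column)
print(by_row)
```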
Upvotes: 2
Reputation: 71560
Why do you need sklearn? Just use pandas:
>>> df / df.max()
A B
0 1.000 1.00
1 0.234 0.30
2 0.500 0.15
>>>
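One caveat with plain division: a column whose maximum is 0 produces NaN (0 / 0). A hedged sketch of a helper that guards against this; normalize_columns_by_max is a hypothetical name, and scaling by the largest absolute value rather than the raw maximum is an assumption made here so that negative-valued columns end up in [-1, 1]:

```python
import pandas as pd


def normalize_columns_by_max(frame: pd.DataFrame) -> pd.DataFrame:
    """Divide each column by its largest absolute value (hypothetical helper)."""
    col_scale = frame.abs().max()
    # Assumption: leave all-zero columns unchanged instead of producing NaN.
    col_scale = col_scale.replace(0, 1)
    return frame / col_scale


# The question's data plus an all-zero column to exercise the guard.
df = pd.DataFrame({"A": [1000, 234, 500], "B": [10, 3, 1.5], "C": [0, 0, 0]})
result = normalize_columns_by_max(df)
print(result)
```

For the data in the question, which is all positive, this gives the same values as df / df.max().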
Upvotes: 3