Reputation: 3528
Assuming we have a DataFrame df as follows:
id A B
50 1 5
60 2 6
70 3 7
80 4 8
How can I normalize just column B to the range 0 to 1, while leaving the id column and column A completely unaffected?
Edit 1: If I do the following
import pandas as pd
df = pd.DataFrame({ 'id' : ['50', '60', '70', '80'],
'A' : ['1', '2', '3', '4'],
'B' : ['5', '6', '7', '8']
})
from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
X_minmax = min_max_scaler.fit_transform(df.values[:,[2]])
I get X_minmax as follows:
0
0.333333
0.666667
1
I want these 4 values to replace column B in the DataFrame df without changing the other 2 columns, so that it looks like this:
id A B
50 1 0
60 2 0.333333
70 3 0.666667
80 4 1
Upvotes: 1
Views: 1437
Reputation: 1373
You might want to do something like this.
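import pandas as pd
import sklearn.preprocessing as preprocessing
df = pd.DataFrame({'id': [50, 60, 70, 80], 'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
# MinMaxScaler expects a 2-D float array, so reshape the column to (n_samples, 1)
float_array = df['B'].values.astype(float).reshape(-1, 1)
min_max_scaler = preprocessing.MinMaxScaler()
scaled_array = min_max_scaler.fit_transform(float_array)
# Flatten back to 1-D and overwrite only column B; id and A are untouched
df['B'] = scaled_array.ravel()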
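If your columns hold strings, as in the question's Edit 1, something along these lines might work. This is only a sketch: it assumes every value in B parses as a number, and it uses the double-bracket selection df[['B']] so that the scaler receives a 2-D input without a manual reshape.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Same data as in the question, where every column holds strings
df = pd.DataFrame({'id': ['50', '60', '70', '80'],
                   'A': ['1', '2', '3', '4'],
                   'B': ['5', '6', '7', '8']})

# Convert only column B to float (assumes all of its values are numeric strings)
df['B'] = df['B'].astype(float)

# df[['B']] is a one-column DataFrame, i.e. already 2-D, which is what the scaler expects
df[['B']] = MinMaxScaler().fit_transform(df[['B']])

print(df)
```

The id and A columns are never passed to the scaler, so they keep their original string values.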
Upvotes: 1
Reputation: 2260
You can reassign the value of the column. Subtract the column minimum (not the mean) so that the result falls between 0 and 1:
df.B = (df.B - df.B.min()) / (df.B.max() - df.B.min())
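For reference, min-max scaling maps each value x to (x - min) / (max - min), so the smallest value becomes 0 and the largest becomes 1. Here is a minimal pandas-only sketch on the question's data. It assumes the values in B are numeric strings, which is why it goes through pd.to_numeric first; the variable name b is just for illustration.

```python
import pandas as pd

# Data from the question: all three columns are stored as strings
df = pd.DataFrame({'id': ['50', '60', '70', '80'],
                   'A': ['1', '2', '3', '4'],
                   'B': ['5', '6', '7', '8']})

# Parse column B as numbers (assumes every entry is a valid numeric string)
b = pd.to_numeric(df['B'])

# Min-max scaling: (x - min) / (max - min); here min = 5 and max = 8
df['B'] = (b - b.min()) / (b.max() - b.min())

print(df)
```

With min 5 and max 8 the four values work out to 0/3, 1/3, 2/3 and 3/3, matching the output the question asks for. Note that the formula divides by zero if every value in the column is identical.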
Upvotes: 2