some_programmer

Reputation: 3528

How to normalize just one column of a dataframe while keeping the others unaffected?

Assuming we have a df as follows:

id A  B
50 1  5
60 2  6
70 3  7
80 4  8

How can I normalize just column B to the range 0 to 1, while leaving the id column and column A completely unaffected?

Edit 1: If I do the following

import pandas as pd
df = pd.DataFrame({ 'id' : ['50', '60', '70', '80'],
        'A' : ['1', '2', '3', '4'],
        'B' : ['5', '6', '7', '8']
        })

from sklearn import preprocessing
min_max_scaler = preprocessing.MinMaxScaler()
X_minmax = min_max_scaler.fit_transform(df.values[:,[2]])

I get the X_minmax as follows

0
0.333333
0.666667
1

I want these 4 values to replace column B in the dataframe df, without changing the other 2 columns, so that the result looks like this:

    id A  B
    50 1  0
    60 2  0.333333
    70 3  0.666667
    80 4  1

Upvotes: 1

Views: 1437

Answers (2)

Kartikeya Sharma

Reputation: 1373

You might want to do something like this.

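import pandas as pd
import sklearn.preprocessing as preprocessing

df = pd.DataFrame({'id': [50, 60, 70, 80], 'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})
# MinMaxScaler expects a 2-D array, so reshape column B to shape (n_rows, 1)
float_array = df['B'].values.astype(float).reshape(-1, 1)
min_max_scaler = preprocessing.MinMaxScaler()
scaled_array = min_max_scaler.fit_transform(float_array)
# Overwrite only column B; id and A are left untouched
df['B'] = scaled_array.ravel()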
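
If the columns hold strings, as in the question's dataframe, the same idea still applies once B is converted to float. The following is a minimal sketch under that assumption; it passes the one-column frame df[['B']] to the scaler so no manual reshape is needed, and flattens the result before assigning it back.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Same data as in the question, with every value stored as a string
df = pd.DataFrame({
    'id': ['50', '60', '70', '80'],
    'A': ['1', '2', '3', '4'],
    'B': ['5', '6', '7', '8'],
})

# Convert only column B to float; id and A keep their original values
df['B'] = df['B'].astype(float)

# df[['B']] (double brackets) is a 2-D, one-column frame, which is the
# shape the scaler expects; ravel() flattens the (n, 1) result to 1-D
scaler = MinMaxScaler()
df['B'] = scaler.fit_transform(df[['B']]).ravel()

print(df)
```

Keeping the fitted scaler object around also lets you call scaler.inverse_transform later to recover the original values.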

Upvotes: 1

dzang

Reputation: 2260

You can reassign the value of the column:

df.B = (df.B - df.B.min()) / (df.B.max() - df.B.min())
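
Scaling to the range 0 to 1 means subtracting the column minimum and dividing by the range (max minus min). Here is a pandas-only sketch of that; the helper name min_max_column is hypothetical, and it assumes the column can be parsed as numbers. It also guards against a constant column, where the range would be zero.

```python
import pandas as pd


def min_max_column(frame, column):
    """Return a copy of frame with one column min-max scaled to [0, 1]."""
    # to_numeric handles columns stored as strings, as in the question
    values = pd.to_numeric(frame[column])
    low = values.min()
    span = values.max() - low

    result = frame.copy()
    if span == 0:
        # Constant column: avoid dividing by zero and map everything to 0.0
        result[column] = 0.0
    else:
        result[column] = (values - low) / span
    return result


df = pd.DataFrame({
    'id': ['50', '60', '70', '80'],
    'A': ['1', '2', '3', '4'],
    'B': ['5', '6', '7', '8'],
})

scaled = min_max_column(df, 'B')
print(scaled)
```

Because the helper works on a copy, the original df is left unchanged; assign the result back to df if you want to overwrite it.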

Upvotes: 2
