Julian Drago

Reputation: 749

sklearn normalize() produces every value as 1

I'm trying to normalize a single feature to [0, 1], but every value in the result comes back as 1.0, which is clearly wrong.

import pandas as pd
import numpy as np
from sklearn.preprocessing import normalize

test = pd.DataFrame(data=[7, 6, 5, 2, 9, 9, 7, 8, 6, 5], columns=['data'])
normalize(test['data'].values.reshape(-1, 1))

This produces the following output:

array([[1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.],
       [1.]])

I thought this might be an int-to-float dtype issue, so I tried casting to float first with normalize(test['data'].astype(float).values.reshape(-1, 1)), but that gives the same result. What am I missing?

Upvotes: 3

Views: 2580

Answers (2)

BENY

Reputation: 323226

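To scale to [0, 1] you can apply min-max scaling directly: subtract the minimum and divide by the range (np.ptp gives max minus min):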

(test.data-test.data.min())/np.ptp(test.data.values)
Out[136]: 
0    0.714286
1    0.571429
2    0.428571
3    0.000000
4    1.000000
5    1.000000
6    0.714286
7    0.857143
8    0.571429
9    0.428571
Name: data, dtype: float64
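
If you would rather stay within scikit-learn, MinMaxScaler should give the same [0, 1] result as the formula above. This is only a minimal sketch, assuming the same `test` DataFrame from the question; the `scaler` and `scaled` names are just illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

test = pd.DataFrame(data=[7, 6, 5, 2, 9, 9, 7, 8, 6, 5], columns=['data'])

# MinMaxScaler works per column; the default feature_range is (0, 1).
scaler = MinMaxScaler()

# Double brackets keep the input 2-D (n_samples, 1), which the scaler expects.
scaled = scaler.fit_transform(test[['data']])

# Same computation done by hand, for comparison.
manual = (test['data'] - test['data'].min()) / np.ptp(test['data'].values)

print(scaled.ravel())
print(np.allclose(scaled.ravel(), manual.values))
```

The scaler also remembers the fitted min and max, so the same transformation can be reapplied to new data with scaler.transform().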

Upvotes: 2

Chris

Reputation: 29732

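This is because the default axis is 1, so normalize() scales each row (sample) to unit norm. After reshape(-1, 1) every row holds a single value, and dividing a value by its own norm always gives 1.

Set axis=0 to normalize down the column instead. Note that this scales the column to unit L2 norm; it does not map the values onto [0, 1]: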

normalize(test['data'].values.reshape(-1, 1), axis=0)

Output:

array([[0.32998316],
       [0.28284271],
       [0.23570226],
       [0.0942809 ],
       [0.42426407],
       [0.42426407],
       [0.32998316],
       [0.37712362],
       [0.28284271],
       [0.23570226]])
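
To make the difference between the two axes concrete, here is a rough sketch that reproduces both results by hand with NumPy. It assumes the default norm='l2'; the variable names are only for illustration.

```python
import numpy as np
from sklearn.preprocessing import normalize

x = np.array([7, 6, 5, 2, 9, 9, 7, 8, 6, 5], dtype=float).reshape(-1, 1)

# axis=1 (default): each row is divided by its own L2 norm.
# With one value per row that is v / |v|, which is 1 for any positive v.
row_scaled = normalize(x)

# axis=0: the whole column is divided by the column's L2 norm.
col_scaled = normalize(x, axis=0)

# Manual equivalent of the axis=0 case.
manual = x / np.linalg.norm(x, axis=0)

print(np.allclose(row_scaled, 1.0))
print(np.allclose(col_scaled, manual))
```

Here the column norm is sqrt(450), so 7 becomes roughly 0.33 and the minimum value 2 does not map to 0, which is why this is not the same thing as scaling to [0, 1].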

Upvotes: 7
