Reputation: 1139
I have a pandas DataFrame whose output looks similar to this:
index value
0 5.95
1 1.49
2 2.34
3 5.79
4 8.48
I want to compute the normalised value of each entry in the 'value' column and store it in a new column, 'normalised', but I'm not sure how to apply the normalising function to the column.
My normalising function would look like this: (['value'] - min['value'])/(max['value'] - min['value'])
I know I should probably be using the apply or transform function to add the new column to the dataframe, but I'm not sure how to pass the normalising function to it.
Sorry if I'm getting the terminology wrong, but I'm a newbie to Python, and to pandas in particular!
Upvotes: 1
Views: 444
Reputation: 76297
These are pretty standard column operations:
>>> (df.value - df.value.min()) / (df.value.max() - df.value.min())
0 0.638054
1 0.000000
2 0.121602
3 0.615165
4 1.000000
Name: value, dtype: float64
You can simply write
df['normalized'] = (df.value - ....
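As a minimal, self-contained sketch of that assignment (the sample values are copied from the question; the column name 'normalised' follows the question's wording, and the variable names vmin and vmax are just illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"value": [5.95, 1.49, 2.34, 5.79, 8.48]})

# Compute the column min and max once, then rescale the whole
# column in a single vectorised expression (no apply needed).
vmin = df["value"].min()
vmax = df["value"].max()
df["normalised"] = (df["value"] - vmin) / (vmax - vmin)

print(df)
```

Computing the min and max once up front keeps the expression readable and avoids calling .min() twice.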
Upvotes: 3
Reputation: 11057
I'd consider using the lambda/apply method, which requires determining the min and max values ahead of time. I'm sure you'll be able to finesse it.
First, write a function that outputs a value, based on some 'global' parameters, and an input value fetched from a data-row.
def norm(vmax, vmin, val):
    return (val - vmin) / (vmax - vmin)
Next, collect your global values from the dataframe:
val_min = df['value'].min()
val_max = df['value'].max()
Finally, you can apply the function, creating a new field to hold the result:
df['new_field'] = df.apply(lambda row: norm(val_max, val_min, row['value']), axis=1)
df
   value  new_field
0   5.95   0.638054
1   1.49   0.000000
2   2.34   0.121602
3   5.79   0.615165
4   8.48   1.000000
The beauty of this 'lambda' approach is that you can tweak your functions as you like, which (in my opinion anyway) compartmentalises the code better and allows for reuse, which is always a good thing.
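A variant of the same idea, as a rough sketch: since only one column is involved, the function can be applied to that column with Series.apply instead of row by row over the whole frame. The helper name min_max_scale and its argument order are hypothetical, chosen here so that the value comes first; the data is copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"value": [5.95, 1.49, 2.34, 5.79, 8.48]})


def min_max_scale(val, vmin, vmax):
    """Rescale a single value to the range [0, 1] (hypothetical helper)."""
    return (val - vmin) / (vmax - vmin)


# Collect the 'global' values once, before applying the function.
val_min = df["value"].min()
val_max = df["value"].max()

# Series.apply calls the lambda once per element of the column.
df["new_field"] = df["value"].apply(
    lambda v: min_max_scale(v, val_min, val_max)
)

print(df)
```

Passing the arguments in the same order as the function signature matters here: swapping val_min and val_max would silently produce 1 minus the intended result.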
Upvotes: 1
Reputation: 2536
Let's call your DataFrame DF.
DF['normalised'] = (DF['value'] - min(DF['value'])) / (max(DF['value']) - min(DF['value']))
does the trick.
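For what it's worth, here is a small sketch, using the sample data from the question, suggesting that Python's built-in min/max and the Series methods give the same result on this data. The variable names with_builtins and with_methods are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
DF = pd.DataFrame({"value": [5.95, 1.49, 2.34, 5.79, 8.48]})

# Built-in min/max iterate over the column's values in Python.
with_builtins = (DF["value"] - min(DF["value"])) / (
    max(DF["value"]) - min(DF["value"])
)

# Series.min/Series.max do the same reduction inside pandas.
with_methods = (DF["value"] - DF["value"].min()) / (
    DF["value"].max() - DF["value"].min()
)

DF["normalised"] = with_builtins
print(with_builtins.equals(with_methods))
```

The Series methods are generally the more idiomatic choice in pandas, and they skip missing values (NaN) by default.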
Upvotes: 2