How to apply minmax scaler according to different dataframe

Question

I have a DataFrame as below:

import pandas as pd

df = pd.DataFrame({
    'category': ['fruits','fruits','fruits','fruits','fruits','vegetables','vegetables','vegetables','vegetables','vegetables'],
    'product' : ['apple','orange','durian','coconut','grape','cabbage','carrot','spinach','grass','potato'],
    'sales'   : [10,20,30,40,100,10,30,50,60,100]
})

df.head(15)

Current method: normalize the sales of a single category in df, manually:

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()

# .copy() avoids SettingWithCopyWarning when assigning to the filtered frame
df_fruits = df[df['category'] == "fruits"].copy()
df_fruits['sales'] = scaler.fit_transform(df_fruits[['sales']])
df_fruits.head()
# to_csv is a DataFrame method (pd.to_csv does not exist)
df_fruits.to_csv('minmax/output/category-{}-minmax.csv'.format('XX'))

Questions:
- How can I loop through all the categories in df and scale each one separately?
- How can I then export one CSV file per category, with the category name in the filename?

Thanks a lot.

Henry Yik · Accepted Answer

Use Series.unique to get each distinct category, then filter, scale, and export inside a loop:

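# scaler is the MinMaxScaler instance created in the question
for i in df["category"].unique():
    # .copy() avoids SettingWithCopyWarning on the assignment below
    cat = df[df['category'] == i].copy()
    # fit_transform refits the scaler on this category only
    cat['sales'] = scaler.fit_transform(cat[['sales']])
    cat.to_csv('minmax/output/category-{}-minmax.csv'.format(i))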
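The same result can probably also be reached without an explicit filter per category, by letting groupby do the splitting. The following is only a sketch of that alternative, not the answer's method: it swaps MinMaxScaler for the plain min-max formula (x - min) / (max - min) computed in pandas, and the helper names minmax_by_category and export_by_category are made up for illustration. It assumes the default MinMaxScaler feature range of 0 to 1.

```python
from pathlib import Path

import pandas as pd


def minmax_by_category(df, group_col="category", value_col="sales"):
    """Return a copy of df with value_col rescaled to [0, 1] within each group.

    Hypothetical helper: plain pandas arithmetic, not sklearn's MinMaxScaler.
    """
    out = df.copy()
    grouped = out.groupby(group_col)[value_col]
    lo = grouped.transform("min")   # per-row minimum of that row's group
    hi = grouped.transform("max")   # per-row maximum of that row's group
    # Where a group is constant (max == min), divide by 1 so the result is 0
    # instead of NaN.
    span = (hi - lo).where(hi != lo, 1)
    out[value_col] = (out[value_col] - lo) / span
    return out


def export_by_category(df, out_dir, group_col="category"):
    """Write one CSV per group into out_dir and return the written paths.

    Hypothetical helper: the filename pattern follows the one in the question.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    paths = []
    for name, part in df.groupby(group_col):
        path = out_dir / "category-{}-minmax.csv".format(name)
        part.to_csv(path, index=False)
        paths.append(path)
    return paths
```

With the df from the question, export_by_category(minmax_by_category(df), 'minmax/output') would write one file per category. Unlike the loop above, this version leaves df itself unchanged and omits the index column from the CSV files.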
