Reputation: 13
I'm trying to implement the entropy method, so I need to apply MinMaxScaler to a DataFrame that is built from a dict. Because I don't know in advance how many parameters the values of the dict will contain, I couldn't name all the rows when calling DataFrame.from_dict(). As a result, when I apply MinMaxScaler, the DataFrame turns into an ndarray and the index that came from the dict keys disappears. How can I keep the index?
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
dict = {'A':[89, 11], 'B':[80, 96], 'C':[97, 89], 'D':[90, 24], 'E': [100, 90]}
df = pd.DataFrame.from_dict(dict, orient='index')
print(df)
scaler = MinMaxScaler()
df = scaler.fit_transform(df[0:])
print(df)
yij = df.apply(lambda x: x / x.sum(), axis=0) # An error occurs here: AttributeError: 'numpy.ndarray' object has no attribute 'apply'
K = 1/np.log(len(df))
tmp = yij*np.log(yij)
tmp=np.nan_to_num(tmp)
ej = -K*(tmp.sum(axis=0))
wj = (1 - ej) / np.sum(1 - ej)
score = yij.apply(lambda x: np.sum(100 * x * wj), axis=1)
print(score)
Upvotes: 0
Views: 2220
Reputation: 5304
You can use:
df[:] = scaler.fit_transform(df)
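This replaces every value with its scaled value in place. Because the assignment writes into the existing DataFrame instead of rebinding df to the ndarray returned by fit_transform, the row and column indexes are kept.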
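For comparison, here is a minimal sketch of a different way to keep the labels: wrap the ndarray returned by fit_transform in a new DataFrame that reuses the original index and columns. This is an alternative to the slice assignment above, not the same technique. The names data, scaled and score are chosen for illustration only, and the remaining steps are adapted from the question's code, with fillna(0.0) used in place of np.nan_to_num so the intermediate result stays a DataFrame.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Same data as in the question; "data" avoids shadowing the built-in dict.
data = {'A': [89, 11], 'B': [80, 96], 'C': [97, 89], 'D': [90, 24], 'E': [100, 90]}
df = pd.DataFrame.from_dict(data, orient='index')

scaler = MinMaxScaler()
# fit_transform returns a plain ndarray, so wrap it in a DataFrame
# that reuses the original row and column labels.
scaled = pd.DataFrame(scaler.fit_transform(df), index=df.index, columns=df.columns)

# Remaining entropy-method steps, adapted from the question's code.
yij = scaled / scaled.sum(axis=0)            # column-wise proportions
K = 1 / np.log(len(scaled))
with np.errstate(divide='ignore', invalid='ignore'):
    tmp = (yij * np.log(yij)).fillna(0.0)    # treat 0 * log(0) as 0
ej = -K * tmp.sum(axis=0)                    # entropy of each column
wj = (1 - ej) / np.sum(1 - ej)               # weight of each column
score = (100 * yij * wj).sum(axis=1)         # one score per row label
print(score)
```

Here scaled and score are still indexed by 'A' to 'E', and the original df is left untouched, which can help if the unscaled values are needed later.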
Upvotes: 3