Reputation: 2978
I am trying to adapt code that runs fine with pandas so that it works with Koalas:
import pandas as pd
from databricks import koalas as ks
from sklearn import preprocessing
pdf = pd.DataFrame({'x':range(3), 'y':[1,2,5], 'z':[100,200,1000]})
df = ks.from_pandas(pdf)
min_max_scaler = preprocessing.MinMaxScaler()
result = min_max_scaler.fit_transform(df)
It fails at the last line with the following error:
ValueError: could not convert string to float: 'x'
It seems that the header line of the Koalas DataFrame is interpreted as a normal row by the fit_transform function.
Is there any workaround?
Thanks.
Upvotes: 2
Views: 921
Reputation: 14911
You will get a little further by changing
df = ks.from_pandas(pdf)
to
df = ks.from_pandas(pdf).set_index('x')
to explicitly make x the index column in the pandas and Koalas DataFrames.
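If the data fits in memory on the driver, another possible workaround is to convert the Koalas DataFrame back to pandas with to_pandas() before handing it to scikit-learn, which expects local array-like input. The sketch below is only an illustration under that assumption: as_local_pandas is a hypothetical helper name (not part of Koalas or scikit-learn), and a plain pandas DataFrame stands in for the Koalas one so that the snippet runs without Spark.

```python
import pandas as pd
from sklearn import preprocessing


def as_local_pandas(frame):
    """Hypothetical helper: return a plain pandas DataFrame."""
    if isinstance(frame, pd.DataFrame):
        return frame
    # Assumption: anything else is a Koalas DataFrame, which provides
    # to_pandas(). This collects all rows onto the driver.
    return frame.to_pandas()


pdf = pd.DataFrame({'x': range(3), 'y': [1, 2, 5], 'z': [100, 200, 1000]})
# With Koalas installed this would be: frame = ks.from_pandas(pdf)
frame = pdf  # stand-in so the example runs without Spark

local = as_local_pandas(frame)
min_max_scaler = preprocessing.MinMaxScaler()
scaled = min_max_scaler.fit_transform(local)  # returns a NumPy array

# Put the scaled values back into a labelled DataFrame.
result = pd.DataFrame(scaled, columns=local.columns, index=local.index)
print(result)
```

Note that this gives up distributed execution for the scaling step, so it is only reasonable for data small enough to collect locally.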
Upvotes: 1