Fluxy

Reputation: 2978

Koalas is incompatible with Sklearn - ValueError: could not convert string to float: 'x'

I am trying to adapt code that runs fine with pandas so that it works with Koalas:

import pandas as pd
from databricks import koalas as ks
from sklearn import preprocessing

pdf = pd.DataFrame({'x':range(3), 'y':[1,2,5], 'z':[100,200,1000]})

df = ks.from_pandas(pdf)

min_max_scaler = preprocessing.MinMaxScaler()
result = min_max_scaler.fit_transform(df)

It fails at the last line with the following error:

ValueError: could not convert string to float: 'x'

It seems that the fit_transform function interprets the header line of the Koalas DataFrame as a normal row.

Is there any workaround?

Thanks.

Upvotes: 2

Views: 921

Answers (1)

Tagar

Reputation: 14911

You will get a little further by changing

df = ks.from_pandas(pdf)

to

df = ks.from_pandas(pdf).set_index('x')

to explicitly make x the index column of the resulting Koalas DataFrame.
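If that is still not enough, one possible workaround is to convert the Koalas DataFrame back to pandas before handing it to scikit-learn. The sketch below is only an illustration, not a confirmed fix for every case: `scale_min_max` is a hypothetical helper name, and it assumes the data is small enough to be collected to the driver, since `to_pandas()` pulls everything into local memory. The example call uses a plain pandas DataFrame so that it runs without Spark.

```python
import pandas as pd
from sklearn import preprocessing


def scale_min_max(frame):
    """Min-max scale a pandas or Koalas DataFrame; returns a pandas DataFrame.

    Hypothetical helper, written for illustration only.
    """
    # Assumption: anything that is not a pandas DataFrame but has to_pandas()
    # is a Koalas DataFrame. to_pandas() collects all rows to the driver.
    if not isinstance(frame, pd.DataFrame) and hasattr(frame, "to_pandas"):
        frame = frame.to_pandas()

    scaler = preprocessing.MinMaxScaler()
    scaled = scaler.fit_transform(frame)  # NumPy array, one column per input column

    # Re-attach the column labels and index that the NumPy array dropped.
    return pd.DataFrame(scaled, columns=frame.columns, index=frame.index)


pdf = pd.DataFrame({'x': range(3), 'y': [1, 2, 5], 'z': [100, 200, 1000]})
result = scale_min_max(pdf)
# With Koalas available, the call would be: scale_min_max(ks.from_pandas(pdf))
```

The scaling itself then runs on a single machine, so this only makes sense when the DataFrame fits in memory there.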

Upvotes: 1
