Shawn Brar

Reputation: 1420

Leave-one-out encoding with Dask dataframes

I have the following Dask dataframe:

import dask.dataframe as dd
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 1, 2, 3, 1, 2, 3, 5],
    "B": ["a", "b", "c", "c", "a", "b", "b", "a", "c"],
    "y": [0, 10, 2, 1, 4, 1, 6, 12, 11],
})
X = dd.from_pandas(df, npartitions=2)

In the dataframe X, column B holds the categories that I want to encode, and column y holds the target values. This is just an example dataframe; my real dataset has more than 1000 categories.
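To make clear what I mean by leave-one-out encoding: each row's category is replaced by the mean of y over the other rows in the same category, excluding the row itself. Below is a minimal sketch of the result I am after, written in plain pandas only. The helper name leave_one_out, the output column B_loo, and returning NaN for a category that occurs only once are my own assumptions for illustration, not part of any library.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 2, 1, 2, 3, 1, 2, 3, 5],
    "B": ["a", "b", "c", "c", "a", "b", "b", "a", "c"],
    "y": [0, 10, 2, 1, 4, 1, 6, 12, 11],
})


def leave_one_out(frame, cat_col, target_col):
    """Hypothetical helper: mean of target over the *other* rows of the same category."""
    grouped = frame.groupby(cat_col)[target_col]
    total = grouped.transform("sum")    # per-row sum of y within the row's category
    count = grouped.transform("count")  # per-row size of the row's category
    # Remove the row's own y from the sum and average over the remaining rows.
    # Assumption: a category seen only once has no "other" rows, so it yields NaN.
    return (total - frame[target_col]) / (count - 1).where(count > 1)


df["B_loo"] = leave_one_out(df, "B", "y")
print(df)
```

For example, category "a" has y values 0, 4 and 12, so the row with y = 0 should be encoded as (4 + 12) / 2 = 8.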

How can I do leave-one-out encoding on a Dask dataframe?

Thanks in advance.

Upvotes: 1

Views: 77

Answers (0)