k92

Reputation: 385

Is it possible to change pandas column data type within a sklearn pipeline?

The sklearn pipeline I am using has multiple transformers, but one of the initial transformers returns a numerical type and the next one expects object-type variables.

Basically, I need to squeeze in a:

data[col] = data[col].astype(object)

for the required columns within the pipeline.

Is there any way to do it?

Note: I am using Feature-engine transformers.

Upvotes: 4

Views: 4261

Answers (1)

thushv89

Reputation: 11343

Yes, you can use sklearn.preprocessing.FunctionTransformer, which wraps an arbitrary function so it can be used as a pipeline step. A simple example:

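import pandas as pd
from sklearn.preprocessing import FunctionTransformer

def to_object(x):
    # Cast every column to object dtype
    return pd.DataFrame(x).astype(object)

fun_tr = FunctionTransformer(to_object)

y = fun_tr.fit_transform(pd.DataFrame({'a': [1, 2, 3]}))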
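
To cast only the required columns inside a pipeline, as the question asks, something along these lines might work. This is a minimal sketch, not a tested recipe for Feature-engine: the helper `cast_columns_to_object`, its `columns` argument, the step name, and the sample data are all hypothetical, and the other transformers would be added as further steps before and after the cast.

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer


def cast_columns_to_object(df, columns):
    # Hypothetical helper: return a copy in which only the listed
    # columns are cast to object dtype; other columns keep their dtype.
    return df.astype({col: object for col in columns})


# Illustrative data; "a" stands in for a column that an earlier
# transformer returned as numeric.
data = pd.DataFrame({"a": [1, 2, 3], "b": [0.5, 1.5, 2.5]})

pipe = Pipeline([
    # kw_args forwards the column list to the wrapped function
    ("to_object", FunctionTransformer(cast_columns_to_object,
                                      kw_args={"columns": ["a"]})),
    # ... transformers that expect object-type columns would follow here
])

out = pipe.fit_transform(data)
print(out.dtypes)
```

Because `astype` returns a new DataFrame, the input passed to the pipeline is left unchanged.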

Upvotes: 13
