I am in the process of deploying a machine learning model for study purposes and I have some questions about it:
preprocessor = pipeline.named_steps["columntransformer"]
model = pipeline.named_steps["xgbclassifier"]
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-29-f928ce436ece> in <cell line: 15>()
13
14 # preprocessor.fit(df[["tenure", "OnlineSecurity", "TechSupport", "Contract"]])
16 print(preprocessed_df)
17
17 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/indexes/base.py in _raise_if_missing(self, key, indexer, axis_name)
5939
5940 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
-> 5941 raise KeyError(f"{not_found} not in index")
5942
5943 @overload
KeyError: "['MonthlyCharges', 'TotalCharges'] not in index"
kbest = final_estimator2.named_steps["selectkbest"].get_support(indices=True)
used_df = transformed_df_columns.iloc[:, kbest]
Is there a step I'm forgetting?
I double-checked all of the code and the official documentation.
I would like to understand why my preprocessor is asking for two features that, in theory, were not selected by SelectKBest during the training phase.
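To make the situation concrete, here is a minimal sketch that reproduces the same kind of failure. It is not my real pipeline: the random data, the `StandardScaler`, and `k=1` are assumptions chosen only for illustration, and only the column names come from my dataset. It suggests that the fitted `ColumnTransformer` still expects every column it was fitted on, since `SelectKBest` runs after it and only drops columns from its output.

```python
import numpy as np
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (assumption): three numeric columns, target derived from "tenure".
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "tenure": rng.normal(size=50),
    "MonthlyCharges": rng.normal(size=50),
    "TotalCharges": rng.normal(size=50),
})
y = (df["tenure"] > 0).astype(int)

# The preprocessor is fitted on all three columns; SelectKBest comes afterwards.
pipeline = make_pipeline(
    make_column_transformer(
        (StandardScaler(), ["tenure", "MonthlyCharges", "TotalCharges"])
    ),
    SelectKBest(f_classif, k=1),
)
pipeline.fit(df, y)

preprocessor = pipeline.named_steps["columntransformer"]
kbest = pipeline.named_steps["selectkbest"].get_support(indices=True)

# Passing only a subset of the fitted columns to the preprocessor fails,
# even though SelectKBest keeps a single feature downstream.
# The exception type depends on the scikit-learn/pandas versions involved.
try:
    preprocessor.transform(df[["tenure"]])
    error = None
except (KeyError, ValueError) as exc:
    error = exc

print("indices kept by SelectKBest:", kbest)
print("error from the preprocessor:", error)
```

If this reflects what happens in my case, the input would need to contain all of the columns the preprocessor was fitted on, or the pipeline would need to be refitted using only the selected columns.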