dingaro

Reputation: 2342

How to train a series of XGBClassifier models in Python, dropping one variable after each iteration?

I have a pandas DataFrame in Python like the one below:

Input data:

Requirements:

Desired output:

So, as a result, I need something like the output below.

My draft is below. It is wrong: it should loop through all the variables so that each iteration creates a new XGBoost classification model, and after each iteration one variable is discarded before the next model is created.

import pandas as pd
from sklearn import metrics
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# 70/30 stratified split; "Y" is the target column
X_train, X_test, y_train, y_test = train_test_split(
    df.drop("Y", axis=1),
    df.Y,
    train_size=0.70,
    test_size=0.30,
    random_state=1,
    stratify=df.Y,
)

results = []
list_of_models = []

# Problem: every iteration fits on the same full feature set
for val in X_train:

    model = XGBClassifier()
    model.fit(X_train, y_train)
    list_of_models.append(model)

    preds_train = model.predict(X_train)
    preds_test = model.predict(X_test)
    preds_prob_train = model.predict_proba(X_train)[:, 1]
    preds_prob_test = model.predict_proba(X_test)[:, 1]

    results.append({"AUC_train": round(metrics.roc_auc_score(y_train, preds_prob_train), 3),
                    "AUC_test": round(metrics.roc_auc_score(y_test, preds_prob_test), 3)})

results = pd.DataFrame(results)

How can I do that in Python?

Upvotes: 0

Views: 360

Answers (1)

MichaelB

Reputation: 55

Do you want to narrow your data on each pass through the loop? If I understand correctly, you could do something like this:

results = []
list_of_models = []

# X_train.columns is evaluated once, so reassigning X_train below is safe
for i in X_train.columns:
    model = XGBClassifier()
    model.fit(X_train, y_train)
    list_of_models.append(model)

    preds_train = model.predict(X_train)
    preds_test = model.predict(X_test)
    preds_prob_train = model.predict_proba(X_train)[:, 1]
    preds_prob_test = model.predict_proba(X_test)[:, 1]
    results.append({"AUC_train": round(metrics.roc_auc_score(y_train, preds_prob_train), 3),
                    "AUC_test": round(metrics.roc_auc_score(y_test, preds_prob_test), 3)})
    # Drop the current column so the next model trains on one fewer variable
    X_train = X_train.drop(i, axis=1)
    X_test = X_test.drop(i, axis=1)

results = pd.DataFrame(results)
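
If you would rather not overwrite X_train and X_test inside the loop, one possible variation is to generate the shrinking column lists first and select them per model. This is only a sketch: the helper names `shrinking_feature_sets` and `evaluate_feature_sets` and the `fit_and_score` callback are made up for illustration, and the columns are dropped in their original order, as in the loop above.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def shrinking_feature_sets(columns: Sequence[str]) -> Iterator[List[str]]:
    """Yield the full column list, then the same list minus its first
    column, and so on until a single column is left."""
    remaining = list(columns)
    while remaining:
        yield list(remaining)       # copy, so callers cannot mutate our state
        remaining = remaining[1:]   # discard the first remaining column


def evaluate_feature_sets(
    columns: Sequence[str],
    fit_and_score: Callable[[List[str]], Dict[str, float]],
) -> List[dict]:
    """Call fit_and_score once per feature set and collect one row each.

    fit_and_score is a hypothetical callback: it receives the column names
    to use and returns a dict of metrics for the model fitted on them.
    """
    rows = []
    for features in shrinking_feature_sets(columns):
        row = {"n_features": len(features), "features": features}
        row.update(fit_and_score(features))
        rows.append(row)
    return rows
```

With the split from the question, `fit_and_score` could fit `XGBClassifier()` on `X_train[features]`, score it on `X_test[features]`, and return `{"AUC_train": ..., "AUC_test": ...}`; `pd.DataFrame(rows)` then gives one row per model, and the original frames stay untouched.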

Upvotes: 1
