Shahalan

AdaBoostClassifier raises "algorithm='SAMME.R' requires ... predict_proba" even though I already set algorithm='SAMME.R'

import pandas as pd
from sklearn.svm import SVC
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

from sklearn.preprocessing import StandardScaler
import sys
#import warnings
#warnings.filterwarnings("ignore")

df = pd.read_csv("data.csv")
df.isna().sum()
df = df.fillna(0)

X = df[['salary', 'to_messages', 'deferral_payments', 'total_payments', 'exercised_stock_options', 'bonus', 'restricted_stock', 'shared_receipt_with_poi', 'restricted_stock_deferred', 'total_stock_value', 'expenses', 'loan_advances', 'from_messages', 'other', 'from_this_person_to_poi', 'director_fees', 'deferred_income', 'long_term_incentive', 'from_poi_to_this_person']]
y = df[['poi']]

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.35,random_state = 42)

clf_SVM = SVC(probability=False, kernel='linear')
clf_adaboost_boostAcc = AdaBoostClassifier(n_estimators=20, algorithm='SAMME.R',base_estimator=clf_SVM,learning_rate=1)

model_adaboost_boostAcc = clf_adaboost_boostAcc.fit(X_train, y_train.values.ravel())

The problem is in the last line. I am trying to use AdaBoost, but it raises:

TypeError: AdaBoostClassifier with algorithm='SAMME.R' requires that the weak learner supports the calculation of class probabilities with a predict_proba method. Please change the base estimator or set algorithm='SAMME' instead.

What do I need to do to fix this error? I am following this guide: adaboost+svc. Also, if possible, is there a way to reduce the training time with SVC? (I already tried StandardScaler and OneVsRestClassifier. It still runs for 4 hours without producing a result.)

Answers (1)

user20109839

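For algorithm='SAMME.R', the AdaBoostClassifier documentation says the "estimator must support calculation of class probabilities." An SVC created with probability=False has no predict_proba method, which is why the error is raised.

So set probability=True when creating clf_SVM: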

clf_SVM = SVC(probability=True, kernel='linear')
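
To show the change in context, here is a minimal sketch of the fix. It is not the asker's exact script: the synthetic data from make_classification is an assumption standing in for data.csv, and n_estimators is lowered to keep the run short. The weak learner is passed positionally because its keyword name differs between scikit-learn releases, and algorithm is left at its default, which is SAMME.R in the releases that produce this error.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for data.csv (assumption: 19 numeric features, binary target).
X, y = make_classification(n_samples=200, n_features=19, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42
)

# probability=True gives SVC a predict_proba method, which SAMME.R needs.
clf_svm = SVC(probability=True, kernel="linear")

# The weak learner is the first positional argument; its keyword name is
# base_estimator in older releases and estimator in newer ones.
clf_adaboost = AdaBoostClassifier(clf_svm, n_estimators=5, learning_rate=1.0)
clf_adaboost.fit(X_train, y_train)

accuracy = clf_adaboost.score(X_test, y_test)
print(accuracy)
```

Keep in mind that probability=True makes each SVC fit slower, because scikit-learn calibrates the probabilities with an internal cross-validation.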

For ways to reduce SVC training time, see the answers to these questions:
Making SVM run faster in python - scikit learn
SVM using scikit learn runs endlessly
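
As a rough illustration of the training-time side, the sketch below takes the other route the error message offers: algorithm='SAMME', which does not need predict_proba. It also swaps in LinearSVC on standardized features. This is an assumption on my part and not something from the original post. LinearSVC is a different solver with a slightly different objective than SVC(kernel='linear'), so results can differ. The data is synthetic, and make_samme_adaboost is a hypothetical helper that only exists to cope with releases that no longer accept the algorithm argument.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for data.csv (assumption: 19 numeric features, binary target).
X, y = make_classification(n_samples=300, n_features=19, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)


def make_samme_adaboost(weak_learner, n_estimators):
    """Hypothetical helper: build an AdaBoostClassifier that uses discrete SAMME."""
    try:
        return AdaBoostClassifier(
            weak_learner, n_estimators=n_estimators, algorithm="SAMME"
        )
    except TypeError:
        # Assumption: a release that rejects `algorithm` only implements SAMME.
        return AdaBoostClassifier(weak_learner, n_estimators=n_estimators)


# LinearSVC has no predict_proba, so it only works with the SAMME variant.
clf_fast = make_samme_adaboost(LinearSVC(), n_estimators=20)
clf_fast.fit(X_train_scaled, y_train)

fast_accuracy = clf_fast.score(X_test_scaled, y_test)
print(fast_accuracy)
```

Whether this is actually faster on your data is something to measure. Also check that the scaled features are what reaches fit, since a linear SVM on unscaled monetary columns can take a very long time to converge.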
