Reputation: 21
import pandas as pd
from sklearn.svm import SVC
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.preprocessing import StandardScaler
import sys
#import warnings
#warnings.filterwarnings("ignore")
df = pd.read_csv("data.csv")
df.isna().sum()
df = df.fillna(0)
X = df[['salary', 'to_messages', 'deferral_payments', 'total_payments', 'exercised_stock_options', 'bonus', 'restricted_stock', 'shared_receipt_with_poi', 'restricted_stock_deferred', 'total_stock_value', 'expenses', 'loan_advances', 'from_messages', 'other', 'from_this_person_to_poi', 'director_fees', 'deferred_income', 'long_term_incentive', 'from_poi_to_this_person']]
y = df[['poi']]
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.35,random_state = 42)
clf_SVM = SVC(probability=False, kernel='linear')
clf_adaboost_boostAcc = AdaBoostClassifier(n_estimators=20, algorithm='SAMME.R',base_estimator=clf_SVM,learning_rate=1)
model_adaboost_boostAcc = clf_adaboost_boostAcc.fit(X_train, y_train.values.ravel())
The problem is in the last line. I am trying to use AdaBoost, but it raises:
TypeError: AdaBoostClassifier with algorithm='SAMME.R' requires that the weak learner supports the calculation of class probabilities with a predict_proba method. Please change the base estimator or set algorithm='SAMME' instead.
What do I need to do to fix this error? I am following this guide: adaboost+svc. Also, if possible, is there a way to decrease the training time with SVC? (I have already tried StandardScaler and OneVsRestClassifier. It still ran for 4 hours without producing a result.)
Upvotes: 0
Views: 295
The AdaBoostClassifier documentation says that, for algorithm='SAMME.R', the "estimator must support calculation of class probabilities." SVC only provides predict_proba when it is created with probability=True, so change the probability parameter of clf_SVM:
clf_SVM = SVC(probability=True, kernel='linear')
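The error message also names a second option: keep probability=False and use the discrete 'SAMME' algorithm, which only needs predict(). A minimal sketch of that variant follows, under some assumptions: the data is a synthetic stand-in for data.csv, the base estimator is passed positionally because the keyword was renamed from base_estimator to estimator in later scikit-learn releases, and algorithm is only set when the installed version still accepts it.

```python
import inspect

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the question's data.csv (an assumption for illustration).
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=4,
    n_clusters_per_class=1, class_sep=2.0, random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42
)

# Older releases default to 'SAMME.R'; newer ones deprecate or drop the
# `algorithm` argument, so pass it only if this version accepts it.
extra = {}
if "algorithm" in inspect.signature(AdaBoostClassifier).parameters:
    extra["algorithm"] = "SAMME"

# SAMME only calls predict(), so the SVC does not need probability=True.
clf_SVM = SVC(probability=False, kernel="linear")
clf_adaboost = AdaBoostClassifier(
    clf_SVM, n_estimators=20, learning_rate=1.0, **extra
)
clf_adaboost.fit(X_train, y_train)
y_pred = clf_adaboost.predict(X_test)
```

Skipping probability=True may also help with training time, since SVC fits its probability estimates with an internal cross-validation step on top of the normal fit.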
The answers to the questions linked below discuss how to reduce SVC training time:
Making SVM run faster in python - scikit learn
SVM using scikit learn runs endlessly
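One commonly suggested approach for a linear kernel is to scale the features and switch to LinearSVC, which uses a different solver and usually scales better with the number of samples than SVC(kernel='linear'). The sketch below is only an illustration under assumptions: the data is synthetic (19 columns, a few deliberately inflated to mimic unscaled financial features), and fast_svm is a hypothetical name. It is not taken from the linked answers.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for data.csv with 19 features (an assumption).
X, y = make_classification(
    n_samples=500, n_features=19, n_informative=5,
    n_clusters_per_class=1, random_state=42,
)
# Inflate a few columns to imitate features on very different scales.
X[:, :3] *= 1e6

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.35, random_state=42
)

# Scaling happens inside the pipeline, so the test data is transformed
# with statistics learned from the training data only.
fast_svm = make_pipeline(StandardScaler(), LinearSVC(dual=False))
fast_svm.fit(X_train, y_train)
test_accuracy = fast_svm.score(X_test, y_test)
```

LinearSVC optimizes a slightly different objective than SVC(kernel='linear') (squared hinge loss by default), so the two models are similar but not identical.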
Upvotes: 1