Reputation: 55
I have a dataframe like this:
time        label
-----------------------
morning     good
afternoon   good
night       bad
night       okay
I want to one-hot encode the data so that it can be used in SVM cross-validation. I tried the following:
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVC
X = ds_df[['time']]  # double brackets keep the data 2-D, which OneHotEncoder expects
y = ds_df['label']
enc = OneHotEncoder()
X_vec = enc.fit_transform(X)
model = SVC(kernel='linear')
cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=69)
scores = cross_val_score(model, X_vec, y, cv=cv, scoring='precision_weighted')
Then, I got a warning that says
UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
What should I do? Where did I go wrong?
Upvotes: 0
Views: 47
Reputation: 311
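First, this is just a warning, not an error. Some labels don't appear among the predicted samples. Precision for a label is the number of correct predictions of that label divided by the number of times it was predicted, so it is undefined when the label is never predicted. For those labels the precision is set to 0.0.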
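To make the undefined case concrete, here is a minimal pure-Python sketch of per-label precision. The helper per_label_precision and the toy predictions are made up for illustration and are not part of scikit-learn; the sketch only shows where the 0/0 comes from.

```python
from collections import Counter


def per_label_precision(y_true, y_pred):
    """Return {label: precision}, using None where precision is undefined (0/0)."""
    predicted = Counter(y_pred)  # how often each label was predicted
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # true positives
    result = {}
    for label in sorted(set(y_true) | set(y_pred)):
        if predicted[label] == 0:
            # The label was never predicted, so there is nothing to divide by.
            result[label] = None
        else:
            result[label] = correct[label] / predicted[label]
    return result


# Hypothetical predictions in which "okay" is never predicted.
y_true = ["good", "good", "bad", "okay"]
y_pred = ["good", "good", "bad", "bad"]
precisions = per_label_precision(y_true, y_pred)
print(precisions)
```

Here "okay" comes back as None, which is the situation scikit-learn reports with the warning before substituting 0.0.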
As I mentioned, this is a warning, which Python treats differently from an error. By default, most environments show a given warning only once. This behavior can be changed:
import warnings
warnings.filterwarnings('ignore') # "error", "ignore", "always", "default", "module" or "once"
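If silencing every warning is broader than you want, a narrower option is to ignore only this warning category, and only around the call that triggers it. The sketch below assumes the same kind of toy predictions as in the question's labels; UndefinedMetricWarning is importable from sklearn.exceptions.

```python
import warnings

from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import precision_score

# Hypothetical predictions in which "okay" is never predicted.
y_true = ["good", "good", "bad", "okay"]
y_pred = ["good", "good", "bad", "bad"]

# The filter applies only inside the with-block; other warnings are unaffected.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=UndefinedMetricWarning)
    score = precision_score(y_true, y_pred, average="weighted")

print(score)
```

The score itself is unchanged by the filter: the never-predicted label still contributes a precision of 0.0 to the weighted average.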
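If you are not interested in the scores of labels that were never predicted, you can explicitly specify the labels you are interested in when computing the metric. The zero_division parameter named in the warning controls which value is used for the undefined case.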
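One way this might look in practice is to build a custom scorer with make_scorer and pass it to cross_val_score in place of the 'precision_weighted' string. This is a sketch under assumptions: the toy data imitates the question's table, and the choice of ["good", "bad"] as the labels of interest is purely illustrative.

```python
from sklearn.metrics import make_scorer, precision_score
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.preprocessing import OneHotEncoder
from sklearn.svm import SVC

# Hand-made predictions in which "okay" is never predicted.
y_true = ["good", "good", "bad", "okay"]
y_pred = ["good", "good", "bad", "bad"]

# zero_division=0 keeps the 0.0 substitution but silences the warning.
weighted_all = precision_score(y_true, y_pred, average="weighted", zero_division=0)
# labels=... restricts the average to the labels you care about.
weighted_subset = precision_score(
    y_true, y_pred, average="weighted", labels=["good", "bad"], zero_division=0
)

# The same keyword arguments can be bound into a scorer for cross-validation.
scorer = make_scorer(
    precision_score, average="weighted", labels=["good", "bad"], zero_division=0
)

# Toy data modelled on the question's table, repeated so each split has enough rows.
times = ["morning", "afternoon", "night", "night"] * 5
labels = ["good", "good", "bad", "okay"] * 5

X_vec = OneHotEncoder().fit_transform([[t] for t in times])  # 2-D input: one column
cv = ShuffleSplit(n_splits=5, test_size=0.2, random_state=69)
scores = cross_val_score(SVC(kernel="linear"), X_vec, labels, cv=cv, scoring=scorer)
print(weighted_all, weighted_subset, scores)
```

Restricting the labels changes what the score means, since the excluded labels no longer count toward the average, so it is worth stating which labels were used when reporting the result.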
Upvotes: 1