Reputation: 3
I am new to NLP, and I am trying to classify a dataset with GaussianNB() and evaluate it with f1_score. I got this TypeError when calling the f1_score function. Here is my code:
dev_X_train, dev_X_test, dev_y_train, dev_y_test = train_test_split(dev_X, dev_y, test_size = 0.2, random_state =0)
classifier = GaussianNB()
dev_y_train = dev_y_train.astype(numpy.int)
dev_y_test = dev_y_test.astype(numpy.int)
classifier.fit(dev_X_train, dev_y_train)
dev_y_pred = classifier.predict(dev_X_test)
dev_y_pred = dev_y_pred.astype(numpy.int)
score = f1_score(dev_y_test, dev_y_pred, pos_label=1)
print('F1 Score: %.3f' % dev_y_pred)
This is what the training and testing data look like:
dev_X_train:
<class 'numpy.ndarray'> len=80
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]]
dev_y_train:
<class 'numpy.ndarray'> len=80
[1 1 0 0 0 0 1 1 1 1 1 0 1 0 0 0 0 0 0 0 1 1 1 0 1 0 1 1 1 1 1 0 1 0 0 0 1
1 1 0 1 1 1 0 1 0 1 1 0 1 0 1 1 1 0 0 0 1 1 1 0 1 1 0 1 1 0 1 1 0 0 0 0 0
1 0 0 1 0 1]
dev_X_test:
<class 'numpy.ndarray'> len=20
[[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
...
[0 0 0 ... 0 0 0]
[0 0 0 ... 0 0 0]
[1 0 0 ... 0 0 0]]
dev_y_test:
<class 'numpy.ndarray'> len=20 [1 1 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 0 0]
dev_y_pred:
<class 'numpy.ndarray'> len=20 [1 0 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 0 0]
I have tried .astype(numpy.int), as others suggested, but I still get the same error. Could you please explain why this happens and how to fix it?
Here is the full traceback:
Traceback (most recent call last):
File "/Users/chenchiyu/Desktop/COMP90042 NLP/Project/proj.py", line 241, in <module>
print('F1 Score: %.3f' % dev_y_pred)
TypeError: only size-1 arrays can be converted to Python scalars
Upvotes: 0
Views: 311
Reputation: 3855
Did you mean to format your print string with the score variable instead? As the stack trace shows, the error comes from your print call, not the f1_score call. The %.3f format specifier expects a single float, but you are passing it the entire dev_y_pred array rather than a single scalar value. You probably meant to do this: print('F1 Score: %.3f' % score)
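To make the difference concrete, here is a minimal sketch that reproduces the error and the fix using the dev_y_test and dev_y_pred values printed in the question. It depends only on NumPy, so binary_f1 is a hypothetical hand-written stand-in for sklearn.metrics.f1_score in the binary case, not the library function itself; in your script you would keep calling f1_score as before.

```python
import numpy as np

# Labels copied from the dev_y_test and dev_y_pred printouts in the question.
dev_y_test = np.array([1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0])
dev_y_pred = np.array([1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0])


def binary_f1(y_true, y_pred, pos_label=1):
    """Hypothetical stand-in for sklearn.metrics.f1_score (binary case only)."""
    tp = int(np.sum((y_true == pos_label) & (y_pred == pos_label)))
    fp = int(np.sum((y_true != pos_label) & (y_pred == pos_label)))
    fn = int(np.sum((y_true == pos_label) & (y_pred != pos_label)))
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); return 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


score = binary_f1(dev_y_test, dev_y_pred, pos_label=1)

# Original line: %.3f needs one number, but dev_y_pred holds 20 values.
try:
    print('F1 Score: %.3f' % dev_y_pred)
    raised = False
except TypeError as exc:
    raised = True
    print('TypeError:', exc)

# Fixed line: format the scalar score instead of the prediction array.
message = 'F1 Score: %.3f' % score
print(message)
```

The exact wording of the TypeError message may vary between NumPy versions, but the cause is the same: a multi-element array cannot be converted to a single Python float.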
Upvotes: 1