dEBA M

Reputation: 507

How can I solve this unknown label type error?

I am working on feature selection on the NSL-KDD dataset. After preprocessing, type_of_target reports the following for X_newDoS:

type_of_target(X_newDoS)
'continuous-multioutput'

and for Y_DoS:

type_of_target(Y_DoS)
'unknown'

I run the feature selection part as:

from sklearn.feature_selection import RFE
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_jobs=2)

rfe = RFE(clf, n_features_to_select=1)
rfe.fit(X_newDoS, Y_DoS)

The error message:

ValueError                                Traceback (most recent call last)
<ipython-input-31-6c22f9cc2bba> in <module>()
     12 rfe = RFE(clf, n_features_to_select=1)
---> 13 rfe.fit(X_newDoS, Y_DoS)
     14

4 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/multiclass.py in check_classification_targets(y)
    167     if y_type not in ['binary', 'multiclass', 'multiclass-multioutput',
    168                       'multilabel-indicator', 'multilabel-sequences']:
--> 169         raise ValueError("Unknown label type: %r" % y_type)
    170

ValueError: Unknown label type: 'unknown'

X_newDoS is a numpy array and Y_DoS is an array of shape (125972, 2). Looking at multiclass.py, I saw that 'unknown' is not in the list of accepted target types. I tried converting Y_DoS to a numpy array with:

Y_DoS = np.array(Y_DoS)

It is still reported as 'unknown' and rejected by the check in multiclass.py. How can I solve this? How do I convert Y_DoS to a type that multiclass.py recognizes without losing its contents and structure? For reference, I used the code from this link and followed the same preprocessing steps: https://github.com/CynthiaKoopman/Network-Intrusion-Detection/blob/master/DecisionTree_IDS.ipynb

I am pretty new to machine learning. The program worked fine with numpy 1.11.3, sklearn 0.18.1 and pandas 0.19.2. With the library versions currently preinstalled on Colab (numpy 1.16.3, sklearn 0.21.1, pandas 0.24.2), it raises the error shown above.

Upvotes: 3

Views: 4260

Answers (1)

dEBA M

Reputation: 507

Never mind. It seems the Y_DoS variable was stored with a generic object dtype, so sklearn could not recognize its label type. Adding

Y_DoS = Y_DoS.astype('int') 

before the fitting step solved the problem: type_of_target now reports Y_DoS as 'binary'.
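
For anyone hitting the same error, here is a minimal sketch of how to check this yourself. It does not use the NSL-KDD data. The toy array y_obj is a made-up stand-in for Y_DoS, and it assumes the labels are plain 0/1 values held in an array with dtype=object. The sketch prints the dtype and the reported target type before and after the cast, and includes the np.array(...) conversion tried in the question for comparison.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for Y_DoS: 0/1 labels stored with dtype=object
# (an assumption about how the real labels were stored).
y_obj = np.array([0, 1, 1, 0], dtype=object)

# Wrapping an existing object array in np.array() keeps dtype=object,
# so this conversion alone does not change what sklearn sees.
y_copy = np.array(y_obj)

# An explicit cast produces a genuine integer array.
y_int = y_obj.astype(int)

for name, y in [("y_obj", y_obj), ("y_copy", y_copy), ("y_int", y_int)]:
    print(name, y.dtype, type_of_target(y))
```

Checking Y_DoS.dtype first is a quick way to see whether the cast is what you need. If the labels are not actually integers, astype('int') will raise an error or change them, so check the values before casting.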

Upvotes: 3
