Franco Piccolo

Reputation: 7410

Should I cast target class as float or integer?

In sklearn, should I cast the target class as an integer or float? Will it make a difference?

I'm asking because I'm training a neural network and read in this question that having the class cast as float may cause issues.

According to this question, I think the answer is integer, but I would like to know whether this is the case and why.

Upvotes: 1

Views: 2709

Answers (1)

Luca Massaron

Reputation: 1809

In scikit-learn, it makes no difference whether you cast the target class to float or int (or even to a string, see: Is numerical encoding necessary for the target variable in classification?); all of these are allowed. The one thing to note is that classification targets are kept in the same type they had when passed to fit, so if you fit on float targets, you will get a float vector of predictions (see: https://scikit-learn.org/stable/tutorial/basic/tutorial.html#type-casting).

With the following example, you can verify directly that KNeighborsClassifier produces the same class predictions either way, differing only in data type depending on the type of the target passed to fit:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

data = load_iris()
(X_train, X_test, 
 y_train, y_test) = train_test_split(data.data,
                                     data.target,
                                     test_size=0.33,
                                     random_state=42)
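neigh = KNeighborsClassifier(n_neighbors=3)

# Fit on integer targets: predictions come back as integers
neigh.fit(X_train, y_train.astype(int))
int_preds = neigh.predict(X_test)

# Refit on the same targets cast to float: predictions come back as floats
neigh.fit(X_train, y_train.astype(float))
float_preds = neigh.predict(X_test)

# The dtypes differ, but the predicted classes are identical
print(int_preds.dtype, float_preds.dtype)
print("Same classes:", (int_preds == float_preds).all())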
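
The same check can be extended to string targets, which the answer also mentions as allowed. This is only a minimal sketch: the names `y_str` and `clf` are illustrative, and it fits and predicts on the full iris data just to inspect the types. It also prints the fitted `classes_` attribute, which stores the labels in the type they were given.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

data = load_iris()
X, y = data.data, data.target

# Map the integer class codes to the species names (string labels)
y_str = data.target_names[y]

clf = KNeighborsClassifier(n_neighbors=3)

# Float targets: classes_ and predictions are floats
clf.fit(X, y.astype(float))
float_classes = clf.classes_
float_preds = clf.predict(X[:5])

# String targets: classes_ and predictions are strings
clf.fit(X, y_str)
str_classes = clf.classes_
str_preds = clf.predict(X[:5])

print(float_classes.dtype, float_preds.dtype)
print(str_classes, str_preds)
```

In both cases the classifier learns the same three classes; only the label representation returned by predict changes.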

Upvotes: 1
