Alex

Reputation: 1

ValueError: all features must be in [0, -1] or [-0, 0]

I have the following code for training an IDS (intrusion detection system):

import pandas as pd

# Importing the KDDCup99 dataset
dataset = pd.read_csv(r'C:\Users\Ahmad\Desktop\Thiese\KDDCup99.csv', on_bad_lines='skip')

X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 41:42].values

# Splitting the dataset into training and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)

''' Data Preprocessing '''

# Applying ColumnTransformer to the categorical columns of X_train and X_test
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(transformers = [('encoder', OneHotEncoder(), [1, 2, 3])], remainder = 'passthrough')
X_train = ct.fit_transform(X_train)

But running it keeps producing the following error:

IndexError: index 1 is out of bounds for axis 0 with size 0


The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File ~\untitled0.py:29 in <module>
    X_train = ct.fit_transform(X_train)

  File ~\anaconda3\lib\site-packages\sklearn\compose\_column_transformer.py:687 in fit_transform
    self._validate_column_callables(X)

  File ~\anaconda3\lib\site-packages\sklearn\compose\_column_transformer.py:374 in _validate_column_callables
    transformer_to_input_indices[name] = _get_column_indices(X, columns)

  File ~\anaconda3\lib\site-packages\sklearn\utils\__init__.py:384 in _get_column_indices
    raise ValueError(

ValueError: all features must be in [0, -1] or [-0, 0]

What is the problem here?

Upvotes: 0

Views: 670

Answers (1)

MosQuan

Reputation: 140

The problem here is in the file name.

To reproduce it, I followed http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html and downloaded the dataset from there, changing your first line to

dataset = pd.read_csv(r'kddcup.data_corrected',on_bad_lines='skip')

Executing the rest of your code then gives no errors. Note that the file is in CSV format but its name has no '.csv' extension.

To avoid such problems in the future, you can check that the data was read correctly with

dataset.shape

Including its output in the question is also useful when asking about an error like this one.
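The shape check can also be turned into a guard that fails early. The bounds in the error message ([0, -1] or [-0, 0]) suggest that X_train ended up with zero columns, which is what slicing with iloc[:, :-1] would give if the file was read as a single column. The following is only a minimal sketch: the helper name check_shape is hypothetical, and the default of 42 columns (41 features plus one label) is an assumption taken from the iloc[:, 41:42] slice in the question.

```python
def check_shape(data, min_rows=1, expected_columns=42):
    """Raise ValueError if `data.shape` does not look like a correctly loaded table.

    `data` is any object with a 2-D `.shape`, for example a pandas DataFrame.
    `expected_columns=42` is an assumption (41 features + 1 label column).
    """
    n_rows, n_columns = data.shape
    if n_rows < min_rows:
        raise ValueError(f"expected at least {min_rows} rows, got {n_rows}")
    if n_columns != expected_columns:
        raise ValueError(f"expected {expected_columns} columns, got {n_columns}")
    return n_rows, n_columns
```

Calling check_shape(dataset) right after pd.read_csv(...) and before the iloc slicing should stop the script with a readable message instead of the later ColumnTransformer error.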

Upvotes: 0
