odonnry

Reputation: 189

ValueError: Found input variables with inconsistent numbers of samples:

I'm getting a ValueError when running the code below. I thought it was caused by the iloc code that splits the data into x and y, but I can't see what I'm doing wrong:

            if st.checkbox('Select Multiple Columns'):
                new_data = st.multiselect(
                    "Select the target columns. Please note, the target variable should be the last column selected",
                    df.columns)
                df1 = df[new_data]
                st.dataframe(df1)

                # dividing data into X and Y variables
                x = df1.iloc[:, :-1]
                y = df1.iloc[:-1]

            X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=seed)

            clf.fit(X_train, y_train)

            y_pred = clf.predict(X_test)
            st.write('Prediction:', y_pred)

The error I get is the following:

ValueError: Found input variables with inconsistent numbers of samples: [196, 195] Traceback:

Snippet of the dataset:

1/1/20  X   2020    206457
1/1/20  X   2021    70571
1/1/20  X   2022    46918
1/1/20  X   2023    36492
1/1/20  X   2024    0
1/1/20  X   2025    0
1/1/20  X   2020    286616
1/1/20  X   2021    134276
1/1/20  X   2022    87674
1/1/20  X   2023    240
1/1/20  X   2024    0
1/1/20  X   2025    0

Upvotes: 0

Views: 8691

Answers (1)

SeaBean

Reputation: 23217

Check these two statements in your code:

x = df1.iloc[:, :-1]
y = df1.iloc[:-1]

x and y slice df1 differently: x keeps all the rows, while y has one row fewer. Hence the inconsistent numbers of samples: [196, 195], i.e. 196 for x and 195 for y.

Please note that the first argument of iloc[] slices rows, while the second argument slices columns.

x takes all rows and drops the last column. y, however, is given only one argument, so it slices rows only (dropping the last row) and keeps every column, because no column slice is specified as the second argument.
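
As a minimal sketch of the difference, assuming the target really is the last column selected, y can be taken with a column index instead of a row slice. The small DataFrame below is made up for illustration (its column names are hypothetical and only loosely mirror the snippet in the question):

```python
import pandas as pd

# Hypothetical stand-in for df1: three feature columns plus the target as the last column.
df1 = pd.DataFrame({
    "date": ["1/1/20"] * 4,
    "group": ["X"] * 4,
    "year": [2020, 2021, 2022, 2023],
    "value": [206457, 70571, 46918, 36492],
})

x = df1.iloc[:, :-1]      # all rows, every column except the last
y_wrong = df1.iloc[:-1]   # row slice only: drops the last row, keeps all columns
y = df1.iloc[:, -1]       # all rows, last column only

print(x.shape)        # (4, 3)
print(y_wrong.shape)  # (3, 4)
print(y.shape)        # (4,)
```

With y sliced this way, x and y have the same number of rows, which is what train_test_split checks for.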

Upvotes: 2
