Reputation: 147
I am creating a neural network and am currently working on the train/test split
section, using:
import csv
import math
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
import datetime
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
X1 = Values[1:16801] #16,800 values
train_size = int(len(X1) * 0.67)
test_size = len(X1) - train_size
train, test = X1[0:train_size,:], X1[train_size:len(X1),:]
print(len(train), len(test))
I have 16,800 values for X1 which look like:
[0.03454225 0.02062136 0.00186715 ... 0.92857565 0.64930691 0.20325924]
My traceback error message is:
IndexError Traceback (most recent call last)
<ipython-input-16-8cadae12af2c> in <module>()
69 test_size = len(X1) - train_size
70
---> 71 train, test = X1[0:train_size,:], X1[train_size:len(X1),:]
72 print(len(train), len(test))
73
IndexError: too many indices for array
I'm not sure why this happens. If anyone can help, it would be really appreciated.
Upvotes: 1
Views: 242
Reputation: 776
The error is caused by the following line:
train, test = X1[0:train_size,:], X1[train_size:len(X1),:]
Your data is a one-dimensional array, as is evident from the sample posted in your question.
The line above, however, uses a subscript for a second dimension: 0:train_size
selects indices 0 through train_size - 1 along the first dimension, and :
selects all indices along the second dimension. Your data has no second dimension, so NumPy raises the IndexError. The code below should work.
import csv
import math
import numpy as np
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
import datetime
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
# 1D array
X1 = Values[1:16801] #16,800 values
train_size = int(len(X1) * 0.67)
test_size = len(X1) - train_size
train, test = X1[0:train_size], X1[train_size:len(X1)]
print(len(train), len(test))
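To check this on your own data, it may help to look at X1.ndim and X1.shape before slicing. The following is only a minimal sketch: it assumes X1 is a NumPy array and uses 100 random values as a hypothetical stand-in for your real Values. It reproduces the error, shows the 1-D slicing fix, and shows an alternative in which the array is first reshaped into a single column so that the original two-subscript slicing works unchanged.

```python
import numpy as np

# Hypothetical stand-in for the asker's data: 100 random values in a 1-D array
rng = np.random.default_rng(0)
X1 = rng.random(100)

print(X1.ndim, X1.shape)  # 1 (100,)

train_size = int(len(X1) * 0.67)

# Two subscripts on a 1-D array raise IndexError
raised = False
try:
    X1[0:train_size, :]
except IndexError as exc:
    raised = True
    print("IndexError:", exc)

# Option 1: slice along the only axis that exists
train, test = X1[:train_size], X1[train_size:]

# Option 2: reshape to one column first, then the 2-D slicing is valid
X2 = X1.reshape(-1, 1)
train_2d, test_2d = X2[:train_size, :], X2[train_size:, :]

print(train.shape, train_2d.shape)
```

Which option fits depends on what the rest of the pipeline expects. A 2-D array with one row per sample and one column per feature is a common input layout for a Dense layer, so the reshape variant may be the more convenient one if the values are fed to Keras directly.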
Upvotes: 2