Reputation: 5356
I am trying to use a neural network for a classification problem. I have 6 possible classes, and the same input may belong to more than one class (multi-label), so a row of y such as [1, 0, 0, 1, 0, 0] marks membership in classes 0 and 3.
The problem: when I try to train one NN per class, I set output_num_units = 1 and pass only the first column of y, y[:, 0], to fit. I get the following output and error:
## Layer information

  #  name    size
---  ------  ------
  0  input       32
  1  dense0      32
  2  output       1
IndexError: index 1 is out of bounds for axis 1 with size 1
Apply node that caused the error: CrossentropyCategorical1Hot(Elemwise{Composite{scalar_sigmoid((i0 + i1))}}[(0, 0)].0, y_batch)
Inputs types: [TensorType(float32, matrix), TensorType(int32, vector)]
Inputs shapes: [(128, 1), (128,)]
Inputs strides: [(4, 4), (4,)]
Inputs values: ['not shown', 'not shown']
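If I read the IndexError right, CrossentropyCategorical1Hot indexes the prediction matrix by the integer label, so a (128, 1) output matrix has no column for label 1. A small numpy sketch of my understanding (the predictions array is made up):

import numpy as np

# hypothetical 4-sample batch with a single output column, like my (128, 1) case
predictions = np.full((4, 1), 0.5, dtype=np.float32)
labels = np.array([0, 1, 0, 1], dtype=np.int32)
try:
    # categorical cross-entropy picks column labels[i] from each prediction row
    loss = -np.log(predictions[np.arange(4), labels])
except IndexError as e:
    print e  # index 1 is out of bounds for axis 1 with size 1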
If I try to use output_num_units=num_class (6) and the full y (all six columns), I first get an error from StratifiedKFold, because it does not seem to expect y to have multiple columns. If I set eval_size=None, then I get the following error:
TypeError: ('Bad input argument to theano function with name "/usr/local/lib/python2.7/site-packages/nolearn-0.6a0.dev0-py2.7.egg/nolearn/lasagne/base.py:311"
at index 1(0-based)', 'Wrong number of dimensions: expected 1, got 2 with shape (128, 6).')
The only configuration that works is setting more than one output unit and passing only one column of y. Then it trains, but that does not seem right: the network has 2 output units while I have only one y column to compare against.
What am I doing wrong? Why can't I use a single output unit? Should I convert my y from a matrix of 6 columns to a single column holding a class number?
I use the following code (extract):
import numpy as np
from lasagne.layers import DenseLayer, InputLayer
from lasagne.nonlinearities import sigmoid
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet

# load data
data, labels = prepare_data_train('../input/train/subj1_series1_data.csv')

# X_train (119496, 32) <type 'numpy.ndarray'>
X_train = data_preprocess_train(data)
#print X_train.shape, type(X_train)

# y (119496, 6) <type 'numpy.ndarray'>
y = labels.values.astype(np.int32)
print y.shape, type(y)

# net config
num_features = X_train.shape[1]
num_classes = labels.shape[1]

# train neural net
layers0 = [('input', InputLayer),
           ('dense0', DenseLayer),
           ('output', DenseLayer)]

net1 = NeuralNet(
    layers=layers0,
    # layer parameters:
    input_shape=(None, num_features),  # 32 inputs
    dense0_num_units=32,               # number of units in hidden layer
    output_nonlinearity=sigmoid,       # sigmoid, since each net targets a single class
    output_num_units=2,                # if I try 1, it does not work
    # optimization method:
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    max_epochs=50,                     # we want to train this many epochs
    verbose=1,
    eval_size=0.2,
)
net1.fit(X_train, y[:, 0])
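My suspicion, based on the two errors, is that the default categorical cross-entropy objective is what forces y to be a 1-D integer label vector. A sketch of the multi-label configuration I think is needed (untested; regression=True and binary_crossentropy are my guesses, not something I have verified against my nolearn version):

from lasagne.objectives import binary_crossentropy

# sketch: six independent sigmoid outputs trained as a "regression" on the
# full (n, 6) target matrix, so no stratified split over labels is attempted
net_multi = NeuralNet(
    layers=layers0,
    input_shape=(None, num_features),
    dense0_num_units=32,
    output_nonlinearity=sigmoid,
    output_num_units=num_classes,                 # 6 outputs, one per class
    objective_loss_function=binary_crossentropy,  # my guess, not verified
    regression=True,                              # y as a float32 matrix, not labels
    update=nesterov_momentum,
    update_learning_rate=0.01,
    update_momentum=0.9,
    max_epochs=50,
    verbose=1,
)
net_multi.fit(X_train, y.astype(np.float32))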
Upvotes: 2
Views: 496
Reputation: 31
I then wanted to use CNNs in Lasagne, but didn't get it to work the same way, as the predictions were always 0... I recommend you look at the MNIST example. I find it much better to use and to extend, since older code snippets no longer fully work due to API changes over time. I've amended the MNIST example: my target vector has the labels 0 or 1, and I create the output layer for the NN this way:
# Finally, we'll add the fully-connected output layer, of 2 softmax units:
l_out = lasagne.layers.DenseLayer(
        l_hid2_drop, num_units=2,
        nonlinearity=lasagne.nonlinearities.softmax)
And for the CNN:
layer = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(layer, p=.5),
        num_units=2,
        nonlinearity=lasagne.nonlinearities.softmax)
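Something like this then reads predictions out of the 2-unit softmax head (a sketch on top of the MNIST example; input_var and l_out are the variables defined there):

import theano
import lasagne

# deterministic=True disables dropout at prediction time
probs = lasagne.layers.get_output(l_out, deterministic=True)
predict_fn = theano.function([input_var], probs)

# column 1 of the softmax output is P(label == 1); argmax gives the class
# predicted = predict_fn(X_batch).argmax(axis=1)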
Upvotes: 1