Reputation: 63
I am trying to train and test a simple multi-layer perceptron, exactly as in the first Chainer tutorial, but with my own dataset instead of MNIST. This is the code I'm using (mostly from the tutorial):
import chainer.functions as F
import chainer.links as L
from chainer import Chain, iterators, optimizers, training
from chainer.datasets import tuple_dataset
from chainer.training import extensions

class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_units)
            self.l2 = L.Linear(None, n_units)
            self.l3 = L.Linear(None, n_out)

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

X, X_test, y, y_test, xHeaders, yHeaders = load_train_test_data('xHeuristicData.csv', 'yHeuristicData.csv')
print 'dataset shape X:', X.shape, ' y:', y.shape
model = MLP(100, 1)
optimizer = optimizers.SGD()
optimizer.setup(model)
train = tuple_dataset.TupleDataset(X, y)
test = tuple_dataset.TupleDataset(X_test, y_test)
train_iter = iterators.SerialIterator(train, batch_size=100, shuffle=True)
test_iter = iterators.SerialIterator(test, batch_size=100, repeat=False, shuffle=False)
updater = training.StandardUpdater(train_iter, optimizer)
trainer = training.Trainer(updater, (10, 'epoch'), out='result')
trainer.extend(extensions.Evaluator(test_iter, model))
trainer.extend(extensions.LogReport())
trainer.extend(extensions.PrintReport(['epoch', 'main/accuracy', 'validation/main/accuracy']))
trainer.extend(extensions.ProgressBar())
trainer.run()
print 'Predicted value for a test example'
print model(X_test[0])
Instead of training and printing the predicted value, I get the following error at "trainer.run()":
dataset shape X: (1003, 116) y: (1003,)
Exception in main training loop: __call__() takes exactly 2 arguments (3 given)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 299, in run
    update()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 223, in update
    self.update_core()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 234, in update_core
    optimizer.update(loss_func, *in_arrays)
  File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 534, in update
    loss = lossfun(*args, **kwds)
Will finalize trainer extensions and updater before reraising the exception.
Traceback (most recent call last):
  File "trainHeuristicChainer.py", line 76, in <module>
    trainer.run()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 313, in run
    six.reraise(*sys.exc_info())
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/trainer.py", line 299, in run
    update()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 223, in update
    self.update_core()
  File "/usr/local/lib/python2.7/dist-packages/chainer/training/updater.py", line 234, in update_core
    optimizer.update(loss_func, *in_arrays)
  File "/usr/local/lib/python2.7/dist-packages/chainer/optimizer.py", line 534, in update
    loss = lossfun(*args, **kwds)
TypeError: __call__() takes exactly 2 arguments (3 given)
I have no clue about how to deal with the error. I have successfully trained similar networks using other frameworks, but I am interested in Chainer because it is PyPy-compatible.
A tgz with the files is available here: https://mega.nz/#!wwsBiSwY!g72pC5ZgekeMiVr-UODJOqQfQZZU3lCqm9Er2jH4UD8
Upvotes: 0
Views: 361
Reputation: 26
You are sending a tuple (X, y) into the MLP, while your __call__ implementation accepts only x.
You can modify the implementation as follows:
class MLP(Chain):
    def __init__(self, n_units, n_out):
        super(MLP, self).__init__()
        with self.init_scope():
            self.l1 = L.Linear(None, n_units)
            self.l2 = L.Linear(None, n_units)
            self.l3 = L.Linear(None, n_out)

    def __call__(self, x, y):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        predict = self.l3(h2)
        # the loss must be a scalar Variable so that backward() works;
        # y has shape (batch,) while predict has shape (batch, 1), hence the reshape
        loss = F.mean_squared_error(predict, y.reshape(predict.shape))
        # or you can write it on your own as follows
        # loss = F.sum(F.square(predict - y.reshape(predict.shape)))
        return loss
It may differ from other frameworks, but in Chainer the standard updater by default assumes that __call__ is the loss function, so the call model(X, y) returns the loss of the current mini-batch. That is why the Chainer tutorial introduces a separate Classifier class to compute the loss and keep the MLP simple. Classifier makes sense for MNIST, but it does not suit your regression task, so you are on your own to implement the loss function.
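One way to do that, keeping the MLP itself unchanged, is a small wrapper chain in the spirit of Classifier. This is only a sketch; the Regressor name, the mean-squared-error loss and the reshape of y are choices I made for illustration:

import chainer
import chainer.functions as F
from chainer import Chain

class Regressor(Chain):
    def __init__(self, predictor):
        super(Regressor, self).__init__()
        with self.init_scope():
            self.predictor = predictor

    def __call__(self, x, y):
        predict = self.predictor(x)
        # y has shape (batch,) while predict has shape (batch, 1)
        loss = F.mean_squared_error(predict, y.reshape(predict.shape))
        # report the loss so LogReport/PrintReport can display 'main/loss'
        chainer.report({'loss': loss}, self)
        return loss

# usage: wrap the unchanged MLP before setting up the optimizer
# model = Regressor(MLP(100, 1))
# optimizer.setup(model)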
When you have finished training, you can just save the model instance, for example by adding a snapshot_object extension to the trainer.
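For example (the file name pattern is just an illustration; snapshot_object writes into the trainer's out directory, 'result' in your script), you can save the parameters every epoch and load them back later with chainer.serializers.load_npz:

from chainer import serializers
from chainer.training import extensions

# during training: save the model's parameters at the end of every epoch
trainer.extend(extensions.snapshot_object(model, 'model_epoch_{.updater.epoch}.npz'),
               trigger=(1, 'epoch'))

# after training: load the weights back into a freshly constructed model
model = MLP(100, 1)
serializers.load_npz('result/model_epoch_10.npz', model)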
To use the saved model, e.g. for testing, you have to write another method in the class, maybe named test, with the same code as your current __call__, but taking only X as input since no y is available at that point.
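For instance, something like this (the name predict is arbitrary) added to the modified MLP above:

    # an extra method inside the MLP class
    def predict(self, x):
        # same forward pass as __call__, but without computing a loss
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        return self.l3(h2)

After loading the trained weights you can call it with a 2-D mini-batch, e.g. print model.predict(X_test[:1]).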
Furthermore, if you prefer not to add any extra method to the MLP class, keeping it pure, you need to write the updater on your own and compute the loss there. Inheriting from the standard one is easiest; you may write it as follows:
class MyUpdater(chainer.training.StandardUpdater):
    def __init__(self, data_iter, model, opt, device=-1):
        super(MyUpdater, self).__init__(data_iter, opt, device=device)
        self.mlp = model

    def update_core(self):
        batch = self.get_iterator('main').next()
        x, y = self.converter(batch, self.device)
        predict = self.mlp(x)
        # scalar loss; reshape y to match the (batch, 1) prediction
        loss = F.mean_squared_error(predict, y.reshape(predict.shape))
        self.mlp.cleargrads()
        loss.backward()
        # apply the computed gradients via the optimizer
        self.get_optimizer('main').update()

updater = MyUpdater(train_iter, model, optimizer)
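If you also want LogReport/PrintReport to show the training loss (the 'main/accuracy' entry in your PrintReport will stay empty for this regression setup), one possibility, just a sketch, is to report the loss from inside update_core:

import chainer

# inside MyUpdater.update_core, right after computing the loss:
chainer.report({'loss': loss}, self.mlp)

# and then ask PrintReport for it, e.g.
# trainer.extend(extensions.PrintReport(['epoch', 'main/loss']))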
Upvotes: 1