zrbecker

Reputation: 1582

Training a neural network to compute 'XOR' in scikit-learn

I am trying to learn how to use scikit-learn's MLPClassifier. For a very simple example, I thought I'd try just to get it to learn how to compute the XOR function, since I have done that one by hand as an exercise before.

However, it just spits out zeros after I try to fit the model.

import numpy as np
import sklearn.neural_network

xs = np.array([
    0, 0,
    0, 1,
    1, 0,
    1, 1
]).reshape(4, 2)

ys = np.array([0, 1, 1, 0]).reshape(4,)

model = sklearn.neural_network.MLPClassifier(
    activation='logistic', max_iter=10000, hidden_layer_sizes=(4,2))
model.fit(xs, ys)

print('score:', model.score(xs, ys)) # outputs 0.5
print('predictions:', model.predict(xs)) # outputs [0, 0, 0, 0]
print('expected:', np.array([0, 1, 1, 0]))

I put my code in a jupyter notebook on github as well https://gist.github.com/zrbecker/6173ac01ed30be4eea9cc96e21f4896f

Why can't scikit-learn come to a solution, when I can show explicitly that one exists? Is the cost function getting stuck in a local minimum? Is there some kind of regularization happening on the parameters that force them to stay close to 0? The parameters I used were reasonably large (i.e. -30 to 30).
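One way to probe the local-minimum hypothesis directly is to refit the identical logistic network under several random seeds and see whether any initialization escapes. A quick sketch (the seed range is an arbitrary choice for illustration, not a recommendation):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
ys = np.array([0, 1, 1, 0])

# Refit the same architecture under different initializations.
scores = []
for seed in range(10):
    model = MLPClassifier(activation='logistic', max_iter=10000,
                          hidden_layer_sizes=(4, 2), random_state=seed)
    model.fit(xs, ys)
    scores.append(model.score(xs, ys))

print(scores)
```

If some runs hit 1.0 while others stall at 0.5, the failure is initialization-dependent, which points at a bad local minimum (or flat region) rather than regularization.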

Upvotes: 4

Views: 5823

Answers (4)

haalcala

Reputation: 63

Is there a magic sequence of parameters that would allow the model to infer correctly from data it hasn't seen before? None of the solutions mentioned above seems to work for that.

from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# clf = RandomForestClassifier(random_state=0)
# clf = MLPClassifier(activation='logistic', max_iter=100, hidden_layer_sizes=(2,), alpha=0.001, solver='lbfgs', verbose = True)
clf = MLPClassifier(
                activation='logistic',
                max_iter=100,
                hidden_layer_sizes=(2,),
                solver='lbfgs')
X = [[ 0,  0],  # 3 samples, 2 features
     [0, 1],
#      [1, 0],
    [1, 1]]
y = [1, 
     0,
#      1,
     1]  # classes of each sample
clf.fit(X, y)

assert clf.predict([[0, 1]]) == [0]
assert clf.predict([[1, 0]]) == [0]
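The underlying issue is that three of the four truth-table rows do not determine the fourth: any labeling of the held-out point is consistent with the training data, so no choice of hyperparameters can force the "right" answer. A quick check in plain Python, no model needed (note the labels above are actually XNOR, i.e. 1 - XOR, but the argument is the same either way):

```python
# Training rows used above and their labels.
train = {(0, 0): 1, (0, 1): 0, (1, 1): 1}

# Two different Boolean functions that both fit the training rows exactly:
xnor = lambda a, b: 1 - (a ^ b)                        # labels (1, 0) as 0
other = lambda a, b: 1 if (a, b) != (0, 1) else 0      # labels (1, 0) as 1

for (a, b), label in train.items():
    assert xnor(a, b) == label
    assert other(a, b) == label

# They agree on every training row but disagree on the unseen point:
print(xnor(1, 0), other(1, 0))  # -> 0 1
```

Whichever of the two the network converges to is an accident of initialization, not something the data decides.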

Upvotes: 0

GadiHerman

Reputation: 61

The following is a simple example of XOR classification with sklearn.neural_network:

import numpy as np
import sklearn.neural_network

inputs = np.array([[0,0],[0,1],[1,0],[1,1]])
expected_output = np.array([0,1,1,0])

model = sklearn.neural_network.MLPClassifier(
                activation='logistic',
                max_iter=100,
                hidden_layer_sizes=(2,),
                solver='lbfgs')
model.fit(inputs, expected_output)
print('predictions:', model.predict(inputs))

Upvotes: 0

Sergii Zhyla

Reputation: 11

Actually the point here is the 'solver' parameter, which defaults to 'adam' and works well for large data sets, but not for a tiny one like this. A bigger 'alpha' should also help:

MLPClassifier(activation='logistic', max_iter=100, hidden_layer_sizes=(3,), alpha=0.001, solver='lbfgs', verbose = True)

And by the way, it's possible to solve this problem with only 3 neurons in a single hidden layer.
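A complete runnable version of that one-liner might look like this (data and variable names filled in from the question; lbfgs is deterministic given random_state, though a perfect fit on every seed isn't guaranteed):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
ys = np.array([0, 1, 1, 0])

# lbfgs is a full-batch quasi-Newton solver; scikit-learn's docs recommend
# it over adam for small data sets like this one.
model = MLPClassifier(activation='logistic', max_iter=100,
                      hidden_layer_sizes=(3,), alpha=0.001,
                      solver='lbfgs', random_state=0)
model.fit(xs, ys)
print(model.predict(xs))
```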

Upvotes: 1

cs95

Reputation: 402263

It appears a logistic activation is the root cause here.

Change your activation to either tanh or relu (my favourite). Demo:

model = sklearn.neural_network.MLPClassifier(
    activation='relu', max_iter=10000, hidden_layer_sizes=(4,2))
model.fit(xs, ys)

Outputs for this model:

score: 1.0
predictions: [0 1 1 0]
expected: [0 1 1 0]

It's always a good idea to experiment with different network configurations before you settle on the best one or give up altogether.
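That experimentation is easy to automate; a minimal sweep over activations, for instance (the grid here is illustrative, not exhaustive):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

xs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
ys = np.array([0, 1, 1, 0])

# Train one model per activation and record training accuracy.
results = {}
for act in ('logistic', 'tanh', 'relu'):
    model = MLPClassifier(activation=act, max_iter=10000,
                          hidden_layer_sizes=(4, 2), random_state=0)
    model.fit(xs, ys)
    results[act] = model.score(xs, ys)

print(results)
```

The same loop extends naturally to solvers, layer sizes, or alpha values.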

Upvotes: 5
