Reputation: 85
I want to write a one-liner to calculate a confusion/contingency matrix M (a square matrix whose dimensions both equal the number of classes) that counts the cases presented in two vectors of length n: Ytrue and Ypredicted. Obviously the following does not work in Python with NumPy:
error = N.array([error[x,y]+1 for x, y in zip(Ytrue,Ypredicted)]).reshape((n,n))
Any hints on how to write a one-liner confusion matrix calculator?
Upvotes: 0
Views: 2802
Reputation: 56
If your NumPy version is 1.6 or newer and Ytrue and Ypred are NumPy arrays of integer labels running from 1 to n (where n is the number of classes), this code works:
np.bincount(n * (Ytrue - 1) + (Ypred - 1), minlength=n*n).reshape(n, n)
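To illustrate how the one-liner above behaves, here is a minimal sketch on a small made-up sample (the data and n = 3 are assumptions for illustration, with labels taken to be the integers 1..n). Each (true, predicted) pair is mapped to a single flat index, the indices are counted, and the counts are reshaped so that rows correspond to true classes and columns to predicted classes.

```python
import numpy as np

# Hypothetical sample data: integer labels in 1..n
Ytrue = np.array([1, 1, 1, 2, 2, 3, 3, 3])
Ypred = np.array([1, 2, 1, 2, 2, 3, 1, 3])
n = 3  # number of classes

# Encode each (true, predicted) pair as one flat index in [0, n*n),
# count how often each index occurs, then fold the counts into n x n.
M = np.bincount(n * (Ytrue - 1) + (Ypred - 1), minlength=n * n).reshape(n, n)
print(M)
```

Without minlength, the result would be too short to reshape whenever the last (true, predicted) combinations never occur in the data.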
Upvotes: 2
Reputation: 30240
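error = N.array([list(zip(Ytrue,Ypred)).count(x) for x in itertools.product(classes,repeat=2)]).reshape(n,n)
or
error = N.array([z.count(x) for z in [list(zip(Ytrue,Ypred))] for x in itertools.product(classes,repeat=2)]).reshape(n,n)
The latter is more efficient, because the list of (true, predicted) pairs is built only once instead of once per matrix cell, but it is possibly more confusing. (Wrapping zip in list() makes both versions work on Python 3 as well as Python 2.)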
import numpy as N
import itertools
Ytrue = [1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2,
3,3,3,3,3,3,3,3]
Ypred = [1,1,2,1,2,1,3,1,
2,2,2,2,2,2,2,2,
3,3,2,2,2,1,1,1]
# Sort the class labels so rows/columns come out in a predictable order
classes = sorted(set(Ytrue))
n = len(classes)
# Version 1: rebuilds the list of pairs for every cell of the matrix
error = N.array([list(zip(Ytrue,Ypred)).count(x) for x in itertools.product(classes,repeat=2)]).reshape(n,n)
print(error)
# Version 2: builds the list of pairs once and reuses it as z
error = N.array([z.count(x) for z in [list(zip(Ytrue,Ypred))] for x in itertools.product(classes,repeat=2)]).reshape(n,n)
print(error)
Which produces
[[5 2 1]
[0 8 0]
[3 3 2]]
[[5 2 1]
[0 8 0]
[3 3 2]]
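For longer inputs, note that list.count rescans all pairs for each of the n*n cells. A possible variant (a sketch, not part of the one-liners above) counts every pair in a single pass with collections.Counter and then looks up each cell. It assumes, like the code above, that every class appears at least once in Ytrue.

```python
import itertools
from collections import Counter

import numpy as N

Ytrue = [1, 1, 1, 1, 1, 1, 1, 1,
         2, 2, 2, 2, 2, 2, 2, 2,
         3, 3, 3, 3, 3, 3, 3, 3]
Ypred = [1, 1, 2, 1, 2, 1, 3, 1,
         2, 2, 2, 2, 2, 2, 2, 2,
         3, 3, 2, 2, 2, 1, 1, 1]

classes = sorted(set(Ytrue))  # assumes all classes occur in Ytrue
n = len(classes)

# Count each (true, predicted) pair once; missing pairs count as 0
counts = Counter(zip(Ytrue, Ypred))
error = N.array([counts[pair] for pair in itertools.product(classes, repeat=2)]).reshape(n, n)
print(error)
```

This should print the same matrix as the two versions above.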
Upvotes: 4