Reputation: 151
Recently I read a line of code that confused me:
pointsInCurrCluster = dataSet[nonzero(clusterAssment[:, 0].A == i)[0], :]
The author did not define the function for A, so I assume that .A is some kind of built-in function. Does anyone know what it is?
Upvotes: 1
Views: 6442
Reputation: 11
In NumPy, you can get the result of a conditional check on each element of an array by writing an expression like arr > 3. For an array arr = [[1,2,3],[3,4,5]], the output is [[False,False,False],[False,True,True]]. The code here wants a plain array for that comparison, and that is what .A gives you: the array representation of a NumPy matrix.

So clusterAssment[:, 0].A == i checks the first column of every row against the value i. nonzero(clusterAssment[:, 0].A == i) then converts that boolean result into the row and column indices that satisfy the condition; see the numpy.nonzero documentation for details. Since the comparison result is 2-D, nonzero(~)[0] gives the indices of the rows whose first element equals i, and dataSet[nonzero(clusterAssment[:, 0].A == i)[0], :] selects those rows from dataSet.
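To make the steps concrete, here is a minimal sketch that walks through the expression one piece at a time. The values in dataSet and clusterAssment are made up for illustration (they are not from the original code), and I am assuming both are numpy.matrix objects, with the first column of clusterAssment holding each point's cluster index.

```python
import numpy as np

# Hypothetical toy data: four 2-D points and their cluster assignments.
# Column 0 of clusterAssment is the cluster index, column 1 is unused here.
dataSet = np.matrix([[1.0, 1.0], [5.0, 5.0], [1.2, 0.8], [5.1, 4.9]])
clusterAssment = np.matrix([[0, 0.0], [1, 0.0], [0, 0.0], [1, 0.0]])
i = 0

# Step 1: take column 0, convert it to a plain ndarray, compare with i.
mask = clusterAssment[:, 0].A == i      # boolean ndarray of shape (4, 1)

# Step 2: nonzero returns one index array per dimension; [0] is the rows.
rows = np.nonzero(mask)[0]

# Step 3: select those rows (and all columns) from dataSet.
pointsInCurrCluster = dataSet[rows, :]
```

With this toy data, rows holds the indices 0 and 2, so pointsInCurrCluster contains the first and third points.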
Upvotes: 0
Reputation: 231510
In https://github.com/skodali1/python-machine-learning/blob/master/kmeansclusteringalgo.py (found by a Google search for 'python clusterAssment'):
from numpy import *
clusterAssment = matrix(zeros((m,2)))
...
ptsInClust = dataSet[nonzero(clusterAssment[:,0].A==cent)[0]]
In this case clusterAssment is a numpy.matrix object. This is like a numpy.ndarray, except that it is always 2-D and has MATLAB-like matrix operators. clusterAssment.A just turns the matrix into a regular numpy.ndarray, probably so it can be passed to numpy.nonzero.
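A small sketch of the difference, using a made-up 3x2 matrix of zeros (the variable names mat and arr are mine, not from the linked script):

```python
import numpy as np

mat = np.matrix(np.zeros((3, 2)))   # numpy.matrix, always 2-D
arr = mat.A                         # same data as a plain numpy.ndarray

# Slicing a single column of a matrix still gives a 2-D matrix.
col = mat[:, 0]                     # shape (3, 1)

# '*' is matrix multiplication for numpy.matrix ...
prod_mat = mat * mat.T              # shape (3, 3)
# ... but element-wise multiplication for numpy.ndarray.
prod_arr = arr * arr                # shape (3, 2)
```

As far as I know, mat.A gives the same result as np.asarray(mat), which may be the clearer spelling in new code.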
scipy.sparse implements sparse matrices, which also have this .A property. But based on this code, I don't think that applies here.
Upvotes: 1