Reputation: 485
Hi, I am using R and have a cluster assignment matrix that comes out of my clustering function (I am applying a clustering algorithm to Gaussian mixture data). I want to build the cluster labels from it. Here is a toy example of what I want to do.
#simulate data
dat <- Z <- c(rnorm(2, 0, 1), rnorm(2, 2, 3), rnorm(3, 0, 1), rnorm(3, 2, 3))
dat
[1] -0.5350681 1.0444655 2.9229136 8.2528266 -0.7561170 -1.0240702 -1.0012780
[8] -0.1322981 7.8525855 2.2278264
# Making up a cluster assignment matrix (the real one comes out of my
# clustering function)
amat<-matrix(c(1,1,0,0,1,1,1,0,0,0,0,0,1,1,0,0,0,1,1,1), ncol=2, nrow=10)
amat
      [,1] [,2]
 [1,]    1    0
 [2,]    1    0
 [3,]    0    1
 [4,]    0    1
 [5,]    1    0
 [6,]    1    0
 [7,]    1    0
 [8,]    0    1
 [9,]    0    1
[10,]    0    1
I want to create a data frame or vector called (say) "clust" that contains the cluster labels, built from the assignment matrix given above. Basically, it should use the first and second columns of the assignment matrix to assign label 1 to the data coming from the normal distribution N(0,1) and label 2 to the data coming from the normal distribution N(2,3) (mean 2, standard deviation 3 in the rnorm calls above). Any help is appreciated. Thanks in advance.
# clust should look like this (I have no idea how to create this using amat and dat)
clust
[1] 1 1 2 2 1 1 1 2 2 2
Upvotes: 0
Views: 975
Reputation: 77485
Isn't this essentially
1 * column1 + 2 * column2 + 3 * column3 and so on?
That should be straightforward to write as a matrix multiplication with the vector [1,2,3,4,...], since the multiplication already performs the sum.
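A minimal sketch of that idea in R, assuming every row of amat contains exactly one 1 (hard assignments). The name weights is only illustrative; amat is the matrix from the question.

```r
# Assignment matrix from the question: one row per observation,
# one column per cluster, a single 1 in each row.
amat <- matrix(c(1,1,0,0,1,1,1,0,0,0,
                 0,0,1,1,0,0,0,1,1,1), ncol = 2, nrow = 10)

# Weight column j by j, i.e. the vector 1, 2, ..., k.
weights <- seq_len(ncol(amat))

# %*% multiplies each column by its weight and sums across the row,
# so each row collapses to the index of the column holding its 1.
clust <- as.vector(amat %*% weights)
clust
```

For the matrix in the question this should reproduce the labels 1 1 2 2 1 1 1 2 2 2, and the same line works unchanged if the matrix has more than two columns.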
Upvotes: 1
Reputation: 28461
The matrix is already binary, so with only two clusters we can add 1L to the second column:
clust <- amat[,2] + 1L
[1] 1 1 2 2 1 1 1 2 2 2
(The suffix L makes the constant 1 an integer rather than a double. Since amat holds doubles here, the result is still a numeric vector.)
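The shortcut above relies on there being exactly two columns. As a hedged sketch for the general case, base R's max.col returns, for each row, the index of the column with the largest value, which for a 0/1 assignment matrix with one 1 per row is the cluster label. The names clust_shortcut and clust_general are only illustrative.

```r
# Assignment matrix from the question.
amat <- matrix(c(1,1,0,0,1,1,1,0,0,0,
                 0,0,1,1,0,0,0,1,1,1), ncol = 2, nrow = 10)

# Two-cluster shortcut: second column is 0 for cluster 1, 1 for cluster 2.
clust_shortcut <- amat[, 2] + 1L

# General version: column index of the row maximum.
# ties.method = "first" keeps the result deterministic if a row
# ever had more than one maximum (not the case for hard assignments).
clust_general <- max.col(amat, ties.method = "first")

# Both approaches should agree on this two-column example.
all(clust_shortcut == clust_general)
```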
Upvotes: 1