Reputation: 9822
I am a new R-user and I have a kind of algorithm problem. I made some research on the web and on Stackoverflow, but can't find my answer.
I have a squared matrix, for example :
A B C D
A 0 0 0 1
B 0 1 1 0
C 1 0 0 0
D 0 1 1 1
This matrix represents links between keywords (A, B, C and D here). A '1' (or a TRUE) means keywords are in relation. For example, the '1' on the first row means A is linked to D.
I need to find the two most linked keywords on the matrix. I know I need to compute the scalar product between each row and the initial matrix. Then I take the sum of the rows and get the maximum. But, what is the R program which put in a new matrix the product between each row of my matrix, and the matrix itself ?
Thanks!
Upvotes: 0
Views: 664
Reputation: 226801
I thought I had a cleverer answer but it turns out to be slower ...
tmp1 <- function(a) {
n <- nrow(a)
aa <- apply(array(apply(a,1,"*",a),
rep(n,3)),3,rowSums)
apply(aa,2,which.max)
}
Previous solution:
tmp2 <- function(a) {
n <- nrow(a)
r <- numeric(n)
for(i in seq(n)) {
b <- rowSums(a[i,]*a)
r[i] <- which.max(b)
}
r
}
Test this on something reasonably large:
n <- 50
a <- matrix(0,nrow=n,ncol=n)
a[sample(length(a),size=n^2/5,replace=TRUE)] <- 1
all(tmp1(a)==tmp2(a)) ## TRUE
library(rbenchmark)
benchmark(tmp1(a),tmp2(a))
> benchmark(tmp1(a),tmp2(a))
test replications elapsed relative user.self sys.self
1 tmp1(a) 100 4.030 9.264368 2.052 1.96
2 tmp2(a) 100 0.435 1.000000 0.232 0.20
You will presumably do even better if you can do it in terms of sparse matrices.
Upvotes: 3
Reputation: 11860
Like this?
a=matrix(c(0,0,0,1,0,1,1,0,1,0,0,0,0,1,1,1), ncol=4, byrow=T)
for(i in 1:4){
b = rowSums(a[i,]*a)
print(which(b==max(b)))
}
Upvotes: 1