Reputation: 395
I have created a matrix that counts, for every pair of character vectors in a list, how many strings the two vectors have in common.
sapply(names(setlist), function(x) sapply(names(setlist), function(y) sum(setlist[[x]] %in% setlist[[y]])))
   A B  C  D
A 50 1  0  0
B  1 6  0  0
C  0 0 51  8
D  0 0  8 46
For example, vectors A and B have exactly 1 string in common, out of 50 strings in A and 6 in B.
I would like to normalize each count by the combined size of the two vectors. Using the example above, A and B contain 56 strings in total, so the normalized value is 1 / 56 = .018. The end result should look something like this:
     A    B    C    D
A   .5 .018    0    0
B .018   .5    0    0
C    0    0   .5 .082
D    0    0 .082   .5
Reputation: 32538
#DATA
m = structure(c(50L, 1L, 0L, 0L, 1L, 6L, 0L, 0L, 0L, 0L, 51L, 8L,
0L, 0L, 8L, 46L), .Dim = c(4L, 4L), .Dimnames = list(c("A", "B",
"C", "D"), c("A", "B", "C", "D")))
Use sapply to go through each column and divide it by the sum of that column's diagonal element and the diagonal element of each row:
sapply(X = 1:NCOL(m), function(i) round(x = m[,i]/(m[i,i]+diag(m)), digits = 3))
# [,1] [,2] [,3] [,4]
#A 0.500 0.018 0.000 0.000
#B 0.018 0.500 0.000 0.000
#C 0.000 0.000 0.500 0.082
#D 0.000 0.000 0.082 0.500
If you would rather have 1 on the diagonal, you could substitute it with replace:
sapply(X = 1:NCOL(m), function(i)
    replace(x = round(x = m[,i]/(m[i,i]+diag(m)), digits = 3),
            list = i,
            values = 1))
# [,1] [,2] [,3] [,4]
#A 1.000 0.018 0.000 0.000
#B 0.018 1.000 0.000 0.000
#C 0.000 0.000 1.000 0.082
#D 0.000 0.000 0.082 1.000
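As an alternative to looping over columns, the same normalization could probably be written in vectorized form with outer. This is only a sketch: the names d, denom and norm are made up for illustration, and it assumes m is a square count matrix whose diagonal holds the size of each vector, as in the DATA block above.

```r
# Same count matrix as in the DATA block above
m <- matrix(c(50, 1,  0,  0,
               1, 6,  0,  0,
               0, 0, 51,  8,
               0, 0,  8, 46),
            nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "B", "C", "D"), c("A", "B", "C", "D")))

d <- diag(m)                 # size of each vector: 50, 6, 51, 46
denom <- outer(d, d, "+")    # denom[i, j] = d[i] + d[j]
norm <- round(m / denom, digits = 3)

# Optional: put 1 on the diagonal instead of 0.5
# diag(norm) <- 1
```

Unlike the sapply version, this should keep both the row and the column names of m. If two vectors are both empty, the corresponding denominator is 0 and the result is NaN, so that case would need separate handling.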