Reputation: 365
Suppose I have a data frame with two variables, corresponding to two indices calculated for different groups (A, B and C, for example). The data frame is essentially:
>df
Group v.1 v.2
A 2 3
B 4 4
C 7 9
I would like to calculate the pairwise differences for each variable (`v.1` and `v.2`) and then present the result in a cross-tabulation format, so that the values below the diagonal give the pairwise differences in `v.1` and the values above the diagonal give the pairwise differences in `v.2`. The result would look like:
  A B C
A 0 1 6
B 2 0 5
C 5 3 0
Is there any package that would help me achieve this? Any suggestions would be welcome.
Upvotes: 2
Views: 402
Reputation: 1
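The accepted answer by @A5C1D2H2I1M1N2O1R2T1 is not entirely correct, because of the index order in the `upper.tri` assignment: the `upper.tri` positions are filled column by column, whereas `combn` generates the pairs row by row.
You will only see this with four or more elements.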
df <- data.frame(list(
  "Group" = c("A", "B", "C", "D"),
  "v.1" = c( 1, 1, 1, 3 ),
  "v.2" = c( 1, 1, 1, 2 )
))
m <- matrix(0, nrow = nrow(df), ncol = nrow(df),
            dimnames=list(df$Group, df$Group))
m[lower.tri(m)] <- combn(df$v.1, 2, FUN=diff)
m[upper.tri(m)] <- combn(df$v.2, 2, FUN=diff)
m
#   A B C D
# A 0 0 0 0
# B 0 0 1 1
# C 0 0 0 1
# D 2 2 2 0
(see the `m[2,3]` element, which should be at `m[1,4]` instead)
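To see where the mismatch comes from, one can list the cell positions in the order R fills them and compare them with the order in which `combn` generates the pairs. This is only an illustrative sketch for a 4 x 4 matrix; the names `upper_order`, `lower_order` and `combn_order` are made up for this example.

```r
# Cells of the upper triangle in the order an assignment fills them
# (column by column): (1,2) (1,3) (2,3) (1,4) (2,4) (3,4)
upper_order <- which(upper.tri(matrix(0, 4, 4)), arr.ind = TRUE)

# Cells of the lower triangle, also column by column:
# (2,1) (3,1) (4,1) (3,2) (4,2) (4,3)
lower_order <- which(lower.tri(matrix(0, 4, 4)), arr.ind = TRUE)

# Pairs in the order combn() generates them:
# (1,2) (1,3) (1,4) (2,3) (2,4) (3,4)
combn_order <- t(combn(4, 2))

# The lower triangle matches combn() once row and column are swapped ...
all(lower_order[, 2:1] == combn_order)
# ... but the upper triangle does not (the 3rd and 4th pairs trade places)
all(upper_order == combn_order)
```

With three groups the two orders coincide, which is why the problem does not show up in the original example.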
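Solution: I would suggest filling the lower triangle with the `v.2` differences first and then transposing, so that they land in the upper triangle in the right positions; maybe there is an easier way?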
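n <- matrix(0, nrow = nrow(df), ncol = nrow(df),
            dimnames=list(df$Group, df$Group))
# put the v.2 differences into the lower triangle, where the fill order matches combn
n[lower.tri(n)] <- combn(df$v.2, 2, FUN=diff)
# transposing moves them to the upper triangle, each in its correct cell
n <- t(n)
# the lower triangle is empty again and can take the v.1 differences
n[lower.tri(n)] <- combn(df$v.1, 2, FUN=diff)
n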
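As for a possibly easier way: one option that avoids relying on the fill order altogether is to index the matrix with explicit (row, column) pairs. The following is a sketch under that assumption, not a tested replacement for the code above; `pairs` and `m2` are names introduced here for illustration.

```r
df <- data.frame(Group = c("A", "B", "C", "D"),
                 v.1 = c(1, 1, 1, 3),
                 v.2 = c(1, 1, 1, 2))

# One row per pair (i, j) with i < j, in the same order combn() uses below
pairs <- t(combn(nrow(df), 2))

m2 <- matrix(0, nrow = nrow(df), ncol = nrow(df),
             dimnames = list(df$Group, df$Group))

# A two-column index matrix addresses individual cells as (row, column)
m2[pairs] <- combn(df$v.2, 2, FUN = diff)         # cell (i, j): upper triangle
m2[pairs[, 2:1]] <- combn(df$v.1, 2, FUN = diff)  # cell (j, i): lower triangle
m2
```

Because every difference is written to the cell named by its own pair, the result does not depend on how `upper.tri` or `lower.tri` order their positions.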
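The other answer by @Valentin_Stefan, using the outer product, gives the correct result for the special case of the OP's question. However, note that the `mat_dif1[mat_dif1<0] <- 0` step (and likewise `mat_dif2[mat_dif2>0] <- 0`) is only appropriate if your data are non-decreasing across the groups. Otherwise differences end up in the wrong triangle: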
df <- data.frame(list(
"Group" = c("A", "B", "C", "D"),
"v.1" = c( 1, 1, 1, 3 ),
"v.2" = c( 1, 1, 2, 1 )
))
# matrix of pairwise differences for v.1
mat_dif1 <- outer(X = df$v.1, Y = df$v.1, FUN = "-")
mat_dif1[mat_dif1<0] <- 0
# matrix of pairwise differences for v.2
mat_dif2 <- outer(X = df$v.2, Y = df$v.2, FUN = "-")
mat_dif2[mat_dif2>0] <- 0
mat_dif1 + abs(mat_dif2)
# [,1] [,2] [,3] [,4]
# [1,] 0 0 1 0
# [2,] 0 0 1 0
# [3,] 0 0 0 0
# [4,] 2 2 3 0
The `outer` product is much `FUN`, and one can use it in comparable cases, yet for the OP's problem I guess one cannot get by without `upper.tri` and `lower.tri`:
df <- data.frame(list(
  "Group" = c("A", "B", "C", "D"),
  "v.1" = c( 1, 1, 1, 3 ),
  "v.2" = c( 1, 1, 2, 1 )
))
# symmetric matrices of absolute pairwise differences
mat_dif1 <- outer(X = df$v.1, Y = df$v.1, FUN = function(X, Y) abs(Y-X))
mat_dif2 <- outer(X = df$v.2, Y = df$v.2, FUN = function(X, Y) abs(Y-X))
o <- matrix(0, nrow = nrow(df), ncol = nrow(df),
            dimnames=list(df$Group, df$Group))
# copy each triangle from the same positions, so the fill order does not matter
o[lower.tri(o)] <- mat_dif1[lower.tri(mat_dif1)]
o[upper.tri(o)] <- mat_dif2[upper.tri(mat_dif2)]
o
#   A B C D
# A 0 0 1 0
# B 0 0 1 0
# C 0 0 0 1
# D 2 2 2 0
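These strategies are, of course, much simpler if you just cross-tabulate a single variable, because the matrix of absolute differences is then symmetric and the two triangles need no separate treatment.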
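For the single-variable case, a minimal sketch could look like this. The vector `v` is made-up example data, and the use of `dist` is just one alternative to `outer` that comes to mind, not something taken from the other answers.

```r
# Hypothetical single variable with one value per group
v <- c(A = 1, B = 1, C = 2, D = 1)

# Absolute pairwise differences in one step;
# outer() carries the names of v over as row and column names
d_outer <- outer(v, v, FUN = function(x, y) abs(y - x))

# The same values via dist(), which computes pairwise distances
d_dist <- as.matrix(dist(v))

d_outer
```

Both matrices are symmetric with a zero diagonal, so there is no need for `upper.tri` or `lower.tri` here.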
Upvotes: 0
Reputation: 6446
Not as smooth as the solution by @A5C1D2H2I1M1N2O1R2T1, but one could also approach this with the `outer` function from the R `base` package:
df <- read.table(text =
"Group v.1 v.2
A 2 3
B 4 4
C 7 9",
header = TRUE)
# matrix of pairwise differences for v.1
mat_dif1 <- outer(X = df$v.1, Y = df$v.1, FUN = "-")
mat_dif1[mat_dif1<0] <- 0
# matrix of pairwise differences for v.2
mat_dif2 <- outer(X = df$v.2, Y = df$v.2, FUN = "-")
mat_dif2[mat_dif2>0] <- 0
mat_dif1 + abs(mat_dif2)
## [,1] [,2] [,3]
## [1,] 0 1 6
## [2,] 2 0 5
## [3,] 5 3 0
If you need the row and column names, then:
results <- mat_dif1 + abs(mat_dif2)
dimnames(results) <- list(df$Group, df$Group)
results
## A B C
## A 0 1 6
## B 2 0 5
## C 5 3 0
Upvotes: 0
Reputation: 193627
You can probably use `combn` + `diff` along with `upper.tri` and `lower.tri` as follows:
m <- matrix(0, nrow = nrow(df), ncol = nrow(df),
dimnames=list(df$Group, df$Group))
m
# A B C
# A 0 0 0
# B 0 0 0
# C 0 0 0
m[lower.tri(m)] <- combn(df$v.1, 2, FUN=diff)
m[upper.tri(m)] <- combn(df$v.2, 2, FUN=diff)
m
# A B C
# A 0 1 6
# B 2 0 5
# C 5 3 0
Upvotes: 6