Reputation: 57
I have a league table (as a result of the netleague
function from the netmeta
package) that looks like this (values are made up in this example to simplify how it looks):
A | 0.3 (-0.4 to 0.6) | . |
0.1 (-0.9 to 0.3) | B | -0.6 (-0.9 to 0.0 |
0.2 (-0.8 to 0.4) | 0.3 (-0.6 to 0.1) | C |
Where the lower triangle represents the indirect comparison (those I am interested in for this example) and reads column vs row. For instance, 0.1 (-0.9 to 0.3)
is A vs B
.
These values are stored in R's environment as a data.frame
within a list
of elements.
What I would need to have as a result is:
char1 | char2 | value1 | value2 | value3 |
---|---|---|---|---|
A | B | 0.1 | 0.9 | 0.3 |
A | C | 0.2 | -0.8 | 0.4 |
B | A | -0.1 | -0.3 | 0.9 |
B | C | 0.3 | -0.6 | 0.1 |
C | A | -0.2 | -0.4 | 0.8 |
C | B | -0.3 | -0.1 | 0.6 |
Please note that to obtain B vs A, I inverted the values. Depending on the nature of the outcome, I might do 1/value instead.
I tried having a look at the code behind the netleague
function but it is quite advanced for me to debunk it.
Does anyone have any clues on how to automate this task using R?
So far, my best option has been to either do this manually (time-consuming with more than 8000 values and likely to introduce human-made typos) or at least using some formulae in Excel (still quite time-consuming, I had to adapt the formulae row-by-row).
Data in reproducible format: (source: dput(league[["fixed"]][1:4, ])
)
structure(list(V1 = c("dalia", "0.08 (-0.9 to 0.26)",
"-0.15 (-0.40 to 0.06)", "0.37 ( 0.00 to 0.78)"), V2 = c("-0.05 (-0.33 to 0.22)",
"camelia", "-0.24 (-0.49 to -0.01)", "0.31 (-0.09 to 0.75)"
), V3 = c("-0.14 (-0.64 to 0.32)", "-0.37 (-0.66 to -0.05)",
"margher", "0.54 ( 0.12 to 0.95)"), V4 = c(".", ".", ".",
"rosa_can"), V5 = c(".", ".", ".", "."), V6 = c(".", ".", ".",
"."), V7 = c(".", ".", ".", "."), V8 = c("0.65 ( 0.54 to 0.87)",
"0.54 ( 0.38 to 0.78)", "0.77 ( 0.2 to 1.28)", "0.29 (-0.08 to 0.67)"
), V9 = c(".", ".", ".", "."), V10 = c(".", ".", ".", "."), V11 = c(".",
".", ".", "."), V12 = c("0.23 (-0.52 to 0.99)", ".", "-0.05 (-0.56 to 0.47)",
"."), V13 = c(".", "0.07 (-0.25 to 0.33)", ".", ".")), row.names = c(NA,
4L), class = "data.frame")
Upvotes: 0
Views: 168
Reputation: 79228
You could try:
ff <- function(x, y, z){
sprintf("%.1f (%.1f to %.1f)", -as.numeric(x),
-as.numeric(z), -as.numeric(y))
}
DT <- as.matrix(DT)
rownames(DT) <- colnames(DT) <- diag(DT)
pattern <- r"((\S+)\s+\(\s*(\S+)\s*to\s*(\S+)\))"
DT[upper.tri(DT)] <- gsubfn::gsubfn(pattern,ff, t(DT)[upper.tri(DT)])
with(subset(as.data.frame.table(DT), Var1!=Var2),
cbind(Var2, Var1, read.table(text=gsub("[()to]", " ", Freq))))
Var2 Var1 V1 V2 V3
1 A B 0.1 -0.9 0.3
2 A C 0.2 -0.8 0.4
3 B A -0.1 -0.3 0.9
4 B C 0.3 -0.6 0.1
5 C A -0.2 -0.4 0.8
6 C B -0.3 -0.1 0.6
Upvotes: 0
Reputation: 27732
Here is a first go at your problem, but I could not exact match your desired output. SO please point out the mistakes, so I can have another round at this..
library(data.table)
# Sample data
DT <- fread('"A" "0.3 (-0.4 to 0.6)" "."
"0.1 (-0.9 to 0.3)" "B" "-0.6 (-0.9 to 0.0)"
"0.2 (-0.8 to 0.4)" "0.3 (-0.6 to 0.1)" "C"', header = FALSE)
# Code
# Get names, set as row and colnames
names.v <- diag(as.matrix(DT))
setnames(DT, new = names.v)
DT[, char2 := names.v]
# A B C rowname
# 1: A 0.3 (-0.4 to 0.6) . A
# 2: 0.1 (-0.9 to 0.3) B -0.6 (-0.9 to 0.0) B
# 3: 0.2 (-0.8 to 0.4) 0.3 (-0.6 to 0.1) C C
# Melt to long
ans <- setcolorder(melt(DT, id.vars = "char2", variable.name = "char1"), c(2,1,3))
# char1 char2 value
# 1: A A A
# 2: A B 0.1 (-0.9 to 0.3)
# 3: A C 0.2 (-0.8 to 0.4)
# 4: B A 0.3 (-0.4 to 0.6)
# 5: B B B
# 6: B C 0.3 (-0.6 to 0.1)
# 7: C A .
# 8: C B -0.6 (-0.9 to 0.0)
# 9: C C C
# keep relevant rows
ans <- ans[!char1 == char2, ]
# extract numeric values
ans[, paste0("val", 1:length(tstrsplit(ans$value, "[^0-9-\\.]+", perl = TRUE))) :=
tstrsplit(ans$value, "[^0-9-\\.]+", perl = TRUE)][]
# char1 char2 value val1 val2 val3
# 1: A B 0.1 (-0.9 to 0.3) 0.1 -0.9 0.3
# 2: A C 0.2 (-0.8 to 0.4) 0.2 -0.8 0.4
# 3: B A 0.3 (-0.4 to 0.6) 0.3 -0.4 0.6
# 4: B C 0.3 (-0.6 to 0.1) 0.3 -0.6 0.1
# 5: C A . . <NA> <NA>
# 6: C B -0.6 (-0.9 to 0.0) -0.6 -0.9 0.0
Upvotes: 0