Reputation: 2134
I am severly struggling with a data-rearrangement problem. The data below contains agreements (rows) which collapsed or were stable (column "collapse") and feature provisions which were reduced, kept, added or absent (columns "diff.pps_leadership","diff.pps_cabinet", etc.)
I want to rearrange the data so that I get an overview of how many % of those agreements which reduced, kept, or added a specific provision collapsed. The rows should be the provisions (diff.pps_leadership...), the columns should be "reduced, "kept", and "added". And the content of the cells should be the % of those collapsed (only in relation to those which reduced, kept, or added the provision; not the total).
In Excle I would do this in pivot table, but I haven't been able to get there with R. I tried the cast, aggregate, melt, and transpose commands, but haven't succeeded.
Eventually, the result should look similar to this https://docs.google.com/spreadsheets/d/1yhIbvTQTYkkwSFVxWEnPwvSvwTc0vuTYZxa15Eh1lT8/edit?usp=sharing
Hope my question is not too specific. Grateful for any hint/advice.
example <- structure(list(Agreement = structure(c(8L, 4L, 6L, 9L, 2L, 3L,
7L, 10L, 5L, 1L), .Label = c("Abuja Agreement", "Accra Peace Agreement",
"Arusha Agreement", "Arusha/Global Ceasefire Agreement", "Comprehensive Peace Agreement",
"InterabsentCongolese Dialogue", "Lome Agreement", "Lusaka Protocol",
"Ouagadougou Agreement", "Tansitional Constituion"), class = "factor"),
diff.pps_cabinet = structure(c(2L, 1L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L), .Label = c("kept", "reduced"), class = "factor"),
diff.pps_leadership = structure(c(1L, 2L, 3L, 3L, 3L, 3L,
3L, 3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"),
diff.mps_milcmd = structure(c(3L, 2L, 3L, 3L, 3L, 3L, 1L,
3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"),
diff.mps_armyint = structure(c(3L, 2L, 2L, 3L, 3L, 3L, 1L,
3L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"),
diff.eps_commission = structure(c(1L, 1L, 1L, 1L, 3L, 1L,
3L, 1L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"),
diff.eps_company = structure(c(1L, 2L, 1L, 1L, 3L, 1L, 1L,
1L, 2L, 3L), .Label = c("absent", "kept", "reduced"), class = "factor"),
diff.veto_leg = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L), .Label = c("absent", "added"), class = "factor"),
diff.tps_devolution = structure(c(2L, 1L, 2L, 3L, 1L, 1L,
1L, 2L, 2L, 1L), .Label = c("absent", "kept", "reduced"), class = "factor"),
diff.ca.psh = structure(c(3L, 2L, 1L, 1L, 4L, 1L, 1L, 1L,
4L, 1L), .Label = c("absent", "added", "kept", "reduced"), class = "factor"),
collapse = structure(c(1L, 2L, 2L, 1L, 2L, 1L, 1L, 2L, 2L,
1L), .Label = c("collapse", "stable"), class = "factor")), .Names = c("Agreement",
"diff.pps_cabinet", "diff.pps_leadership", "diff.mps_milcmd",
"diff.mps_armyint", "diff.eps_commission", "diff.eps_company",
"diff.veto_leg", "diff.tps_devolution", "diff.ca.psh", "collapse"
), class = "data.frame", row.names = c(NA, -10L))
Upvotes: 0
Views: 75
Reputation: 34763
The following gets the job done.
library(data.table)
setDT(example)
mvs <- c("diff.pps_cabinet", "diff.pps_leadership",
"diff.mps_milcmd", "diff.mps_armyint")
vls <- c("reduced", "kept", "added", "absent")
melt(example, c("Agreement", "collapse"), mvs
)[ , setNames(vapply(
vls, function(vv) list(paste0(
s <- sum(collapse[idx <- value == vv] == "collapse"),
" out of ", sum(idx), " = ", floor(100 * s / sum(idx)), "% collapsed"),
paste(Agreement[idx], collapse = "\n")),
vector("list", 2)),
paste0(rep(vls, each = 2),
c(".percent", ".names"))), by = variable]
Current prints NaN
when there's nothing; to fix this, replace sum(idx)
in the denominator by (if (!any(idx)) 1 else sum(idx))
.
Upvotes: 1