Reputation: 879
I am again struggling with transforming a wide df into a long one using pivot_longer
The data frame is a result of power analysis for different effect sizes and sample sizes, this is how the original df looks like:
es_issue_owner es_independence es_party pwr_issue_owner_1200 pwr_independence_1200 pwr_party_1200 pwr_issue_owner_2400 pwr_independence_2400 pwr_party_2400
1 0.1 0.1 0.1 0.087 0.080 0.081 0.130 0.163 0.102
2 0.2 0.2 0.2 0.235 0.273 0.157 0.406 0.513 0.267
Or with dput:
example <- structure(list(es_issue_owner = c(0.1, 0.2), es_independence = c(0.1,
0.2), es_party = c(0.1, 0.2), pwr_issue_owner_1200 = c(0.087,
0.235), pwr_independence_1200 = c(0.08, 0.273), pwr_party_1200 = c(0.081,
0.157), pwr_issue_owner_2400 = c(0.13, 0.406), pwr_independence_2400 = c(0.163,
0.513), pwr_party_2400 = c(0.102, 0.267)), row.names = 1:2, class = "data.frame")
Each effect size (es) for three meassures ("independence", "issueowner", "party") is paired with a power calculation on a 1200 and on a 2400 sample size. This is how the output I want to get would look like based on the example above:
type es pwr value
1 independence 0.1 1200 0.080
2 issue_owner 0.1 1200 0.087
3 party 0.1 1200 0.081
4 independence 0.2 1200 0.273
5 issue_owner 0.2 1200 0.235
6 party 0.2 1200 0.157
7 independence 0.1 2400 0.163
8 issue_owner 0.1 2400 0.130
9 party 0.1 2400 0.102
10 independence 0.2 2400 0.513
11 issue_owner 0.2 2400 0.406
12 party 0.2 2400 0.267
or, with dput:
output <- structure(list(type = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L), .Label = c("independence", "issueowner",
"party"), class = "factor"), es = c(0.1, 0.1, 0.1, 0.2, 0.2,
0.2, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2), pwr = c(1200, 1200, 1200,
1200, 1200, 1200, 2400, 2400, 2400, 2400, 2400, 2400), value = c("0.080",
"0.087", "0.081", "0.273", "0.235", "0.157", "0.163", "0.130",
"0.102", "0.513", "0.406", "0.267")), out.attrs = list(dim = c(type = 3L,
es = 2L, pwr = 2L, value = 1L), dimnames = list(type = c("type=independence",
"type=issueowner", "type=party"), es = c("es=0.1", "es=0.2"),
pwr = c("pwr=1200", "pwr=2400"), value = "value=NA")), class = "data.frame", row.names = c(NA,
-12L))
As a start I tried experimenting with this:
example %>%
pivot_longer(cols = everything(),
names_pattern = "(es_[A-Za-z]+)(pwr_[A-Za-z]+_1200)(pwr_[A-Za-z]+_2400)",
# names_sep = "(?=\\d)_(?=\\d)",
names_to = c("es", "pwr_1200", "pwr_2400"),
values_to = "value")
But it did not work, so I tried from two steps, which sort of works, but the "pairing" gets messed up:
example %>%
# pivot_longer(cols = everything(),
# names_pattern = "(es_[A-Za-z]+)(pwr_[A-Za-z]+_1200)(pwr_[A-Za-z]+_2400)",
# # names_sep = "(?=\\d)_(?=\\d)",
# names_to = c("es", "pwr_1200", "pwr_2400"),
# values_to = "value")
pivot_longer(cols = contains("pwr_"),
# names_pattern = "es_pwr(.*)1200_pwr(.*)2400",
names_sep = "_(?=\\d)",
names_to = c("pwr_type", "pwr_sample"), values_to = "value") %>%
pivot_longer(cols = contains("es_"),
# names_pattern = "es_pwr(.*)1200_pwr(.*)2400",
# names_sep = "_(?=\\d)",
names_to = "es_type", values_to = "es")
I would appreciate any help!
Upvotes: 3
Views: 244
Reputation: 52409
library(tidyverse)
example %>%
pivot_longer(cols = starts_with("es"), names_to = "type", names_prefix = "es_", values_to = "es") %>%
pivot_longer(cols = starts_with("pwr"), names_to = "pwr", names_prefix = "pwr_") %>%
filter(substr(type, 1, 3) == substr(pwr, 1, 3)) %>%
mutate(pwr = parse_number(pwr)) %>%
arrange(pwr, es, type)
output
type es pwr value
1 independence 0.1 1200 0.08
2 issue_owner 0.1 1200 0.087
3 party 0.1 1200 0.081
4 independence 0.2 1200 0.273
5 issue_owner 0.2 1200 0.235
6 party 0.2 1200 0.157
7 independence 0.1 2400 0.163
8 issue_owner 0.1 2400 0.13
9 party 0.1 2400 0.102
10 independence 0.2 2400 0.513
11 issue_owner 0.2 2400 0.406
12 party 0.2 2400 0.267
Upvotes: 4