Reputation: 2959
There's got to be a simpler way to do this!
I start with wide format data:
| family_id | first_name_child1 | surname_child1 | first_name_child2 | second_name_child2 | ... |
|...........|...................|................|...................|....................|.....|
And I want to turn it into long format:
| family_id | sibling_number | first_name | surname |
|...........|................|............|.........|
Question: How can I pivot_longer()
while maintaining the first name/surname pairings?
This is how I did it:
df <- structure(list(family_id = 1:2, first_name_child1 = c("Verdie",
"Quentin"), first_name_child2 = c("Iris", "Bryon"), first_name_child3 = c(NA,
"Karie"), first_name_child4 = c(NA, "Christopher"), surname_child1 = c("Moy",
"Mccowen"), surname_child2 = c("Moy", "Mccowen"), surname_child3 = c(NA,
"Mccowen"), surname_child4 = c(NA, "Mccowen")), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))
library(dplyr)
library(tidyr)
fun <- function(x) {
names(x) <- gsub("_child\\d+", "", names(x))
x
}
df %>%
nest(child1 = ends_with("_child1"),
child2 = ends_with("_child2"),
child3 = ends_with("_child3"),
child4 = ends_with("_child4")) %>%
mutate_at(vars(starts_with("child")), lapply, fun) %>%
pivot_longer(-family_id, names_to = "sibling_number",
names_prefix = "child",
values_to = "name") %>%
unnest(name)
BUT I can do the reverse with 1 line:
df2 <- structure(list(family_id = c(1L, 1L, 2L, 2L, 2L, 2L), sibling_number = c(1L,
2L, 1L, 2L, 3L, 4L), first_name = c("Verdie", "Iris", "Quentin",
"Bryon", "Karie", "Christopher"), surname = c("Moy", "Moy", "Mccowen",
"Mccowen", "Mccowen", "Mccowen")), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"))
pivot_wider(df2,
names_from = sibling_number,
names_prefix = "child",
values_from = c("first_name", "surname"))
Is this pivot_wider()
easily reversible? Or alternatively, I thought there might be a way to combine do.call()
, nest()
and ends_with()
, but couldn't work it out?
Upvotes: 0
Views: 100
Reputation: 2959
This answer is copied from Henrik's comment:
pivot_longer(df, cols = -1, names_to = c(".value", "sibling_nr"), names_sep = "child")
Answer is posted and accepted to close out question.
Upvotes: 0
Reputation: 8880
additional solution
df %>%
pivot_longer(cols = -family_id,
names_to = c(".value", "set"),
names_pattern = "(.*)(\\d+)")
Upvotes: 2
Reputation: 2435
The data.table solution would be:
library(data.table)
g <- melt(setDT(df),
id.vars = "family_id",
measure.vars = patterns(first_name = "first_name_child",
surname = "surname_child"),
variable.name = "sibling_number",
na.rm = F)
g[order(family_id)]
family_id sibling_number first_name surname
1: 1 1 Verdie Moy
2: 1 2 Iris Moy
3: 1 3 <NA> <NA>
4: 1 4 <NA> <NA>
5: 2 1 Quentin Mccowen
6: 2 2 Bryon Mccowen
7: 2 3 Karie Mccowen
8: 2 4 Christopher Mccowen
And, as a side note, you can get it back to wide format with
dcast(g,
family_id ~ sibling_number,
value.var = c("first_name","surname"))
Upvotes: 1