Paul
Paul

Reputation: 2959

How can I pivot_longer() while maintaining column pairings?

There's got to be a simpler way to do this!

I start with wide format data:

| family_id | first_name_child1 | surname_child1 | first_name_child2 | second_name_child2 | ... |
|...........|...................|................|...................|....................|.....|

And I want to turn it into long format:

| family_id | sibling_number | first_name | surname |
|...........|................|............|.........|

Question: How can I pivot_longer() while maintaining the first name/surname pairings?

This is how I did it:

df <- structure(list(family_id = 1:2, first_name_child1 = c("Verdie", 
"Quentin"), first_name_child2 = c("Iris", "Bryon"), first_name_child3 = c(NA, 
"Karie"), first_name_child4 = c(NA, "Christopher"), surname_child1 = c("Moy", 
"Mccowen"), surname_child2 = c("Moy", "Mccowen"), surname_child3 = c(NA, 
"Mccowen"), surname_child4 = c(NA, "Mccowen")), row.names = c(NA, 
-2L), class = c("tbl_df", "tbl", "data.frame"))

library(dplyr)
library(tidyr)

fun <- function(x) {
    names(x) <- gsub("_child\\d+", "", names(x))
    x
}

df %>%
    nest(child1 = ends_with("_child1"),
         child2 = ends_with("_child2"),
         child3 = ends_with("_child3"),
         child4 = ends_with("_child4")) %>%
    mutate_at(vars(starts_with("child")), lapply, fun) %>%
    pivot_longer(-family_id, names_to = "sibling_number",
                 names_prefix = "child",
                 values_to = "name") %>%
    unnest(name)

BUT I can do the reverse with 1 line:

df2 <- structure(list(family_id = c(1L, 1L, 2L, 2L, 2L, 2L), sibling_number = c(1L, 
2L, 1L, 2L, 3L, 4L), first_name = c("Verdie", "Iris", "Quentin", 
"Bryon", "Karie", "Christopher"), surname = c("Moy", "Moy", "Mccowen", 
"Mccowen", "Mccowen", "Mccowen")), row.names = c(NA, -6L), class = c("tbl_df", 
"tbl", "data.frame"))

pivot_wider(df2,
            names_from = sibling_number,
            names_prefix = "child",
            values_from = c("first_name", "surname"))

Is this pivot_wider() easily reversible? Or alternatively, I thought there might be a way to combine do.call(), nest() and ends_with(), but couldn't work it out?

Upvotes: 0

Views: 100

Answers (3)

Paul
Paul

Reputation: 2959

This answer is copied from Henrik's comment:

pivot_longer(df, cols = -1, names_to = c(".value", "sibling_nr"), names_sep = "child")

Answer is posted and accepted to close out question.

Upvotes: 0

Yuriy Saraykin
Yuriy Saraykin

Reputation: 8880

additional solution

df %>% 
  pivot_longer(cols = -family_id, 
               names_to = c(".value", "set"),
               names_pattern = "(.*)(\\d+)")

Upvotes: 2

desval
desval

Reputation: 2435

The data.table solution would be:

library(data.table)
g <- melt(setDT(df),
          id.vars = "family_id",
          measure.vars = patterns(first_name = "first_name_child",
                                  surname = "surname_child"),
          variable.name = "sibling_number",
          na.rm = F)

g[order(family_id)]
   family_id sibling_number  first_name surname
1:         1              1      Verdie     Moy
2:         1              2        Iris     Moy
3:         1              3        <NA>    <NA>
4:         1              4        <NA>    <NA>
5:         2              1     Quentin Mccowen
6:         2              2       Bryon Mccowen
7:         2              3       Karie Mccowen
8:         2              4 Christopher Mccowen

And, as a side note, you can get it back to wide format with

dcast(g,
 family_id ~ sibling_number,
 value.var = c("first_name","surname"))

Upvotes: 1

Related Questions