Roshan
Roshan

Reputation: 89

Restructuring data tables in R, convert to row to column if repeated

I have table with details of different families. Each row has the name of a family member. I wanted to restructure my table by bringing the details for all the family members in the singe row. Please see my example tables below.

    Family_no <- c(01,02,02,03,04,05,05,05,04)
name <- c('n1','n2','n3','n4','n5','n6','n7','n8','n9')
gender  <- c('m','f','m','m','f','f','f','f','f')
age <- c(35,35,32,2,4,6,7,11,9)
relation <- c(1,2,1,2,4,6,7,8,1)
df <- data.frame(Family_no,name,gender,age,relation)

 Family_no name gender age relation
1         1   n1      m  35        1
2         2   n2      f  35        2
3         2   n3      m  32        1
4         3   n4      m   2        2
5         4   n5      f   4        4
6         5   n6      f   6        6
7         5   n7      f   7        7
8         5   n8      f  11        8
9         4   n9      f   9        1

I am looking for an outcome similar to the table given below. Could you please suggest me how to go about.

Family_no name gender age relation name gender age relation name gender age relation
1 n1 m 35 1
2 n2 f 35 2 n3 m 32 1
3 n4 m 2 2
4 n5 f 4 4 n9 f 9 1
5 n6 f 6 6 n7 f 7 7 n8 f 7 8

Upvotes: 0

Views: 34

Answers (1)

m0nhawk
m0nhawk

Reputation: 24168

You can use pivot_wider from tidyr to get the result you want.

df_new <- df %>%
  group_by(Family_no) %>%
  mutate(row = row_number())

df_new2 <- df_new %>% 
  pivot_wider(
    names_from = row, 
    values_from = c(name, gender, age, relation)
  )

This would be the result:

> df_new2
# A tibble: 5 × 13
# Groups:   Family_no [5]
  Family_no name_1 name_2 name_3 gender_1 gender_2 gender_3 age_1 age_2 age_3 relation_1 relation_2 relation_3
      <dbl> <chr>  <chr>  <chr>  <chr>    <chr>    <chr>    <dbl> <dbl> <dbl>      <dbl>      <dbl>      <dbl>
1         1 n1     NA     NA     m        NA       NA          35    NA    NA          1         NA         NA
2         2 n2     n3     NA     f        m        NA          35    32    NA          2          1         NA
3         3 n4     NA     NA     m        NA       NA           2    NA    NA          2         NA         NA
4         4 n5     n9     NA     f        f        NA           4     9    NA          4          1         NA
5         5 n6     n7     n8     f        f        f            6     7    11          6          7          8

If the order of the columns is important, I can suggest something like this (not sure if it can be done simpler with pivot_wider directly):

cols <- grep("^(name|gender|age|relation)_", names(df_new), value = TRUE)
n <- length(cols)

df_new3 <- df_new2 %>%
  select(Family_no, 
         matches(sprintf("^(name|gender|age|relation)_%d$", 1:n)))

With result:

> df_new3
# A tibble: 5 × 13
# Groups:   Family_no [5]
  Family_no name_1 gender_1 age_1 relation_1 name_2 gender_2 age_2 relation_2 name_3 gender_3 age_3 relation_3
      <dbl> <chr>  <chr>    <dbl>      <dbl> <chr>  <chr>    <dbl>      <dbl> <chr>  <chr>    <dbl>      <dbl>
1         1 n1     m           35          1 NA     NA          NA         NA NA     NA          NA         NA
2         2 n2     f           35          2 n3     m           32          1 NA     NA          NA         NA
3         3 n4     m            2          2 NA     NA          NA         NA NA     NA          NA         NA
4         4 n5     f            4          4 n9     f            9          1 NA     NA          NA         NA
5         5 n6     f            6          6 n7     f            7          7 n8     f           11          8

Upvotes: 1

Related Questions