Picataro
Picataro

Reputation: 151

How to turn data from long to wide format so that duplicate rows get added to the end to make new columns in R?

I got the following data.

 
ID  NAME   STATUS  OKR_T  OKR_N  NR
1   Jack     1      34     OK1    0
1   Jack     1     433     OK2    0
1   Jack     1      12     OK3    1
2   Bill     2      34     OK1    1
3   Steve    1     433     OK2    1
3   Steve    1      34     OK1    0
3   Steve    1      45     OK4    0

I need to get this:

 
ID  NAME   STATUS  OKR_T.x  OKR_N.x  NR.x  OKR_T.y  OKR_N.y   NR.y OKR_T.z  OKR_N.z  NR.z
1   Jack     1      34      OK1       0     433       OK2      0     12       OK3     1
2   Bill     2      34      OK1       1
3   Steve    1     433      OK2       1      34       OK1      0     45       OK4     0      

I put the .x ; .y and .z to the end of the column names because if i remember correctly this is how R names columns with the same name. It's not important to me exactly how the duplicate columns are named as long as i can tell the difference that what OKR_T ; OKR_N and NR came from the same original row.

I have limited expirience in R and so far i have yoused dcast to reshape my data from long to wide but i can't get it to work with it. Since the logic is different i think that i need another way to do it.

Thank you for your help :)

Upvotes: 0

Views: 694

Answers (1)

Yuriy Saraykin
Yuriy Saraykin

Reputation: 8880

library(tidyverse)
df <- read.table(
  text = "ID  NAME   STATUS  OKR_T  OKR_N  NR
1   Jack     1      34     OK1    0
1   Jack     1     433     OK2    0
1   Jack     1      12     OK3    1
2   Bill     2      34     OK1    1
3   Steve    1     433     OK2    1
3   Steve    1      34     OK1    0
3   Steve    1      45     OK4    0",
  header = T
)
df %>%
  group_by(ID) %>%
  mutate(rid = row_number()) %>%
  pivot_wider(
    id_cols = c(ID, NAME, STATUS),
    names_from = rid,
    values_from = c(OKR_T, OKR_N, NR)
  )
#> # A tibble: 3 x 12
#> # Groups:   ID [3]
#>      ID NAME  STATUS OKR_T_1 OKR_T_2 OKR_T_3 OKR_N_1 OKR_N_2 OKR_N_3  NR_1  NR_2
#>   <int> <chr>  <int>   <int>   <int>   <int> <chr>   <chr>   <chr>   <int> <int>
#> 1     1 Jack       1      34     433      12 OK1     OK2     OK3         0     0
#> 2     2 Bill       2      34      NA      NA OK1     <NA>    <NA>        1    NA
#> 3     3 Steve      1     433      34      45 OK2     OK1     OK4         1     0
#> # ... with 1 more variable: NR_3 <int>

Created on 2021-09-08 by the reprex package (v2.0.1)

Upvotes: 1

Related Questions