Ben
Ben

Reputation: 1

How do I pivot a column of interest into multiple columns while merging matching year-ID-covariate rows into one in R?

I'm trying to pivot a dataset with many variables and observations in R.

Each row is an ID-Year-Outcome triplet. Each ID-Year-Brightness triplet has one or more corresponding colors that change over time.

ID Year Color Brightness
1 2001 Red High
1 2001 Blue High
1 2002 Red Medium
1 2002 Green Medium
2 2002 Blue Low
2 2003 Blue Medium
2 2003 Yellow Medium
3 2003 Green High
3 2004 Orange Low

I want to merge each ID-Year-Brightness triplet into a singular row each and spread out the 'Color' variable amongst multiple columns.

I want it to look like this:

ID Year Color 1 Color 2 Brightness
1 2001 Red Blue High
1 2002 Red Green Medium
2 2002 Blue Low
2 2003 Blue Yellow Medium
3 2003 Green High
3 2004 Orange Low

I've tried using pivot_wider and dcast and cSplit and tidyr::spread, but nothing's worked for me so far unfortunately.

Any help would be greatly appreciated!

Upvotes: 0

Views: 29

Answers (1)

Till
Till

Reputation: 6663

tidyr::pivot_wider() can produce your desired output. First we need to add a column that identifies the row number for each combination of ID and Year.

library(tidyverse)

df |>
  group_by(ID, Year) |>
  mutate(name = row_number()) |>
  pivot_wider(id_cols = c(ID, Year),
              values_from = c(Color, Brightness))
#> # A tibble: 6 × 6
#> # Groups:   ID, Year [6]
#>      ID  Year Color_1 Color_2 Brightness_1 Brightness_2
#>   <int> <int> <chr>   <chr>   <chr>        <chr>       
#> 1     1  2001 Red     Blue    High         High        
#> 2     1  2002 Red     Green   Medium       Medium      
#> 3     2  2002 Blue    <NA>    Low          <NA>        
#> 4     2  2003 Blue    Yellow  Medium       Medium      
#> 5     3  2003 Green   <NA>    High         <NA>        
#> 6     3  2004 Orange  <NA>    Low          <NA>

Upvotes: 0

Related Questions