Reputation: 1
I'm trying to pivot a dataset with many variables and observations in R.
Each row is an ID-Year-Outcome triplet. Each ID-Year-Brightness triplet has one or more corresponding colors that change over time.
ID | Year | Color | Brightness |
---|---|---|---|
1 | 2001 | Red | High |
1 | 2001 | Blue | High |
1 | 2002 | Red | Medium |
1 | 2002 | Green | Medium |
2 | 2002 | Blue | Low |
2 | 2003 | Blue | Medium |
2 | 2003 | Yellow | Medium |
3 | 2003 | Green | High |
3 | 2004 | Orange | Low |
I want to merge each ID-Year-Brightness triplet into a singular row each and spread out the 'Color' variable amongst multiple columns.
I want it to look like this:
ID | Year | Color 1 | Color 2 | Brightness |
---|---|---|---|---|
1 | 2001 | Red | Blue | High |
1 | 2002 | Red | Green | Medium |
2 | 2002 | Blue | Low | |
2 | 2003 | Blue | Yellow | Medium |
3 | 2003 | Green | High | |
3 | 2004 | Orange | Low |
I've tried using pivot_wider and dcast and cSplit and tidyr::spread, but nothing's worked for me so far unfortunately.
Any help would be greatly appreciated!
Upvotes: 0
Views: 29
Reputation: 6663
tidyr::pivot_wider()
can produce your desired output. First we need to add
a column that identifies the row number for each combination of ID
and Year
.
library(tidyverse)
df |>
group_by(ID, Year) |>
mutate(name = row_number()) |>
pivot_wider(id_cols = c(ID, Year),
values_from = c(Color, Brightness))
#> # A tibble: 6 × 6
#> # Groups: ID, Year [6]
#> ID Year Color_1 Color_2 Brightness_1 Brightness_2
#> <int> <int> <chr> <chr> <chr> <chr>
#> 1 1 2001 Red Blue High High
#> 2 1 2002 Red Green Medium Medium
#> 3 2 2002 Blue <NA> Low <NA>
#> 4 2 2003 Blue Yellow Medium Medium
#> 5 3 2003 Green <NA> High <NA>
#> 6 3 2004 Orange <NA> Low <NA>
Upvotes: 0