Reputation: 6155
I am using the palmer penguins data for this reproducible example. For a simple example, I am keeping the data for the year 2007 and the species Adelie
:
install.packages("palmerpenguins")
library(palmerpenguins)
library(tidyverse)
ad <- penguins %>%
filter(species == "Adelie",
year == 2007) %>%
select(species, island)
I want to create individual columns for row numbers for each island
in the data. There are 3 islands:
> unique(ad$island)
[1] Torgersen Biscoe Dream
Levels: Biscoe Dream Torgersen
So, there should be 3 row id columns.
I have done following that works:
ad %>%
add_count(island, name = "island_counts") %>%
group_by(island) %>%
mutate(row_id = 0:(unique(island_counts)-1),
row_id_Torgersen = if_else(island == "Torgersen", row_id, NA_integer_),
row_id_Biscoe = if_else(island == "Biscoe", row_id, NA_integer_),
row_id_Dream = if_else(island == "Dream" , row_id, NA_integer_)
) %>%
ungroup()
# A tibble: 50 x 7
species island island_counts row_id row_id_Torgersen row_id_Biscoe row_id_Dream
<fct> <fct> <int> <int> <int> <int> <int>
1 Adelie Torgersen 20 0 0 NA NA
2 Adelie Torgersen 20 1 1 NA NA
3 Adelie Torgersen 20 2 2 NA NA
4 Adelie Torgersen 20 3 3 NA NA
5 Adelie Torgersen 20 4 4 NA NA
6 Adelie Torgersen 20 5 5 NA NA
7 Adelie Torgersen 20 6 6 NA NA
8 Adelie Torgersen 20 7 7 NA NA
9 Adelie Torgersen 20 8 8 NA NA
10 Adelie Torgersen 20 9 9 NA NA
# ... with 40 more rows
But I want to do this programmatically, so that the number of new columns are not hard coded, but generated based on the number of islands
. Is there a better solution? Thank you.
Upvotes: 0
Views: 26
Reputation: 39595
Try this. You can duplicate the island variable and then reshape to wide using pivot_wider()
. In this way you can avoid creating individual conditionals:
library(palmerpenguins)
library(tidyverse)
ad <- penguins %>%
filter(species == "Adelie",
year == 2007) %>%
select(species, island)
#Code
ad %>%
add_count(island, name = "island_counts") %>%
group_by(island) %>%
mutate(row_id = 0:(unique(island_counts)-1),name=paste0('row_id_',island),
id=row_number()) %>%
ungroup() %>%
pivot_wider(names_from = name,values_from=row_id)%>%
select(-id)
Output:
# A tibble: 50 x 6
species island island_counts row_id_Torgersen row_id_Biscoe row_id_Dream
<fct> <fct> <int> <int> <int> <int>
1 Adelie Torgersen 20 0 NA NA
2 Adelie Torgersen 20 1 NA NA
3 Adelie Torgersen 20 2 NA NA
4 Adelie Torgersen 20 3 NA NA
5 Adelie Torgersen 20 4 NA NA
6 Adelie Torgersen 20 5 NA NA
7 Adelie Torgersen 20 6 NA NA
8 Adelie Torgersen 20 7 NA NA
9 Adelie Torgersen 20 8 NA NA
10 Adelie Torgersen 20 9 NA NA
# ... with 40 more rows
Upvotes: 1