Reputation: 526
I have a colour column that sometimes contains a single colour reference, and sometimes contains multiple colours, which are separated by "||"
library(tidyverse)
id <- c(1:10)
colour <- c("sky-blue","blood-red","lavender-purple",
"sky-blue||blood-red", "midnight-blue", "blood-red||lavender-purple||sky-blue",
"grass-green","sky-blue||blood-red||lavender-purple||midnight-blue",
"grass-green","grass-green||midnight-blue")
df <- tibble("id" = id,
"colour" = colour)
# A tibble: 10 × 2
id colour
<int> <chr>
1 1 sky-blue
2 2 blood-red
3 3 lavender-purple
4 4 sky-blue||blood-red
5 5 midnight-blue
6 6 blood-red||lavender-purple||sky-blue
7 7 grass-green
8 8 sky-blue||blood-red||lavender-purple||midnight-blue
9 9 grass-green
10 10 grass-green||midnight-blue
I would like to separate those colours into individual columns, such that each column only contains one colour, then I would like to stack the colours with duplicate ids. (using gather()
) The name of the new colour columns is rather irrelevant, so I went for "col_1", "col_2", etc. , since I will then stack them again. However, if I run separate()
, it does the following:
df %>%
separate(colour, into = c("col_1","col_2","col_3","col_4"), sep = "||")
# A tibble: 10 × 5
id col_1 col_2 col_3 col_4
<int> <chr> <chr> <chr> <chr>
1 1 "" s k y
2 2 "" b l o
3 3 "" l a v
4 4 "" s k y
5 5 "" m i d
6 6 "" b l o
7 7 "" g r a
8 8 "" s k y
9 9 "" g r a
10 10 "" g r a
This also happens if I run it on a single row with the exact right number of columns in the into=
I have looked at some solutions, but haven't found something that covers irregular separator occurrence, and irregular expression length. Any solution would be most welcome.
Upvotes: 0
Views: 35
Reputation: 2246
You need to escape special symbols with \\
.
So try:
df %>%
separate_rows(colour, sep = "\\|\\|") %>%
mutate(ind = 1) %>%
pivot_wider(
names_from = colour,
values_from = ind,
values_fill = 0
)
Output is:
# A tibble: 10 × 6
id `sky-blue` `blood-red` `lavender-purple` `midnight-blue` `grass-green`
<int> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 0 0 0 0
2 2 0 1 0 0 0
3 3 0 0 1 0 0
4 4 1 1 0 0 0
5 5 0 0 0 1 0
6 6 1 1 1 0 0
7 7 0 0 0 0 1
8 8 1 1 1 1 0
9 9 0 0 0 0 1
10 10 0 0 0 1 1
Upvotes: 2