Reputation: 209
I'm looking to use separate_rows() to tidy data, but my data does not have a delimiter. Instead, I want to "separate" by each individual character. Because there is no delimiter in the data, I'm not sure what can be put in the sep= option.
My data is set up like:
cog func
COG0115 EH
COG0117 H
COG0119 E
COG0124 J
COG0126 G
COG0129 EG
I've tried:
df %>% separate_rows(., 'func', sep='[A-Z]')
But I realize this tells the function to treat each capital letter as a "delimiter", which is definitely not what I want: every letter is consumed as a separator, so the func column ends up empty.
Instead I am looking for:
cog func
COG0115 E
COG0115 H
COG0117 H
COG0119 E
COG0124 J
COG0126 G
COG0129 E
COG0129 G
Upvotes: 1
Views: 367
Reputation: 886938
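A regex lookaround can be used as sep. The pattern (?<=.)(?=.) matches the zero-width position that has a character on both sides, so each string is split between every pair of adjacent characters and no character is consumed.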
library(dplyr)
library(tidyr)
df %>%
  separate_rows(func, sep = '(?<=.)(?=.)')
#      cog func
#1 COG0115    E
#2 COG0115    H
#3 COG0117    H
#4 COG0119    E
#5 COG0124    J
#6 COG0126    G
#7 COG0129    E
#8 COG0129    G
# data used above
df <- structure(list(cog = c("COG0115", "COG0117", "COG0119", "COG0124",
    "COG0126", "COG0129"), func = c("EH", "H", "E", "J", "G", "EG"
    )), class = "data.frame", row.names = c(NA, -6L))
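If tidyr is not available, the same reshaping can probably be done in base R. The following is only a sketch, not part of the answer's method: it rebuilds the example data with data.frame() and uses strsplit() with an empty split string, which breaks each string into single characters. The names chars and out are made up for illustration.

```r
# Same example data as in the question
df <- data.frame(
  cog  = c("COG0115", "COG0117", "COG0119", "COG0124", "COG0126", "COG0129"),
  func = c("EH", "H", "E", "J", "G", "EG"),
  stringsAsFactors = FALSE
)

# split = "" makes strsplit() return one element per character
chars <- strsplit(df$func, "")

# Repeat each cog once per character, then flatten the list of characters
out <- data.frame(
  cog  = rep(df$cog, lengths(chars)),
  func = unlist(chars),
  stringsAsFactors = FALSE
)
print(out)
```

This avoids regular expressions entirely, at the cost of handling only one column to split; other columns would each need the same rep() treatment.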
Upvotes: 2