tinyteeth

Reputation: 209

Can separate_rows() separate by individual character?

I'm looking to use separate_rows() to tidy data, but my data does not have a delimiter. Instead, I want to "separate" by each individual character. Because there is no delimiter in the data, I'm not sure what can be put in the sep= option.

My data is set up like:

    cog      func
    COG0115  EH
    COG0117  H
    COG0119  E
    COG0124  J
    COG0126  G
    COG0129  EG

I've tried:

df %>% separate_rows(., 'func', sep='[A-Z]') 

But I realize this tells the function to treat each capital letter as a delimiter, which is definitely not what I want: every letter is consumed and the result is an empty column...
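To make the failure concrete, here is a minimal base R sketch. It assumes strsplit() behaves closely enough to the splitting done inside separate_rows(), which is not verified here; the vector x is just sample input.

```r
# Base R sketch: strsplit() stands in for the splitting inside separate_rows()
x <- c("EH", "H", "EG")

# Split on any capital letter, as in the attempt above
pieces <- strsplit(x, "[A-Z]")

# Each letter is consumed as a delimiter, so only empty strings remain
all_empty <- all(unlist(pieces) == "")
print(all_empty)
```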

Instead I am looking for:

    cog      func
    COG0115  E
    COG0115  H
    COG0117  H
    COG0119  E
    COG0124  J
    COG0126  G
    COG0129  E
    COG0129  G

Upvotes: 1

Views: 367

Answers (1)

akrun

Reputation: 886938

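A regex lookaround can be used as sep. The pattern (?<=.)(?=.) matches the zero-width position between any two adjacent characters, so the string is split at every character boundary without consuming any character.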

library(dplyr)
library(tidyr)
df %>% 
   separate_rows(func, sep = '(?<=.)(?=.)')
#       cog func
#1 COG0115    E
#2 COG0115    H
#3 COG0117    H
#4 COG0119    E
#5 COG0124    J
#6 COG0126    G
#7 COG0129    E
#8 COG0129    G

data

df <- structure(list(cog = c("COG0115", "COG0117", "COG0119", "COG0124", 
"COG0126", "COG0129"), func = c("EH", "H", "E", "J", "G", "EG"
)), class = "data.frame", row.names = c(NA, -6L))
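For comparison, a base R sketch that avoids regex altogether, assuming func holds only plain letters as in the sample data. strsplit() with an empty split string breaks each value into its individual characters, and rep() repeats each cog once per character. The names chars and long are only illustrative.

```r
# Sample data, same values as the df above
df <- data.frame(
  cog  = c("COG0115", "COG0117", "COG0119", "COG0124", "COG0126", "COG0129"),
  func = c("EH", "H", "E", "J", "G", "EG"),
  stringsAsFactors = FALSE
)

# An empty split string breaks each value into single characters
chars <- strsplit(df$func, "")

# Repeat each cog once per character, then flatten the characters
long <- data.frame(
  cog  = rep(df$cog, lengths(chars)),
  func = unlist(chars),
  stringsAsFactors = FALSE
)
print(long)
```

If a recent tidyr is available (1.3.0 or later, as far as I know), separate_longer_position(df, func, width = 1) should give the same rows without a regex.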

Upvotes: 2
