Splitting one Column into two based upon when a specific character shows up

Question

I have a dataset that includes a column with the following information:

Kay Ivey (R)

Mike Dunleavy (R)

Doug Ducey (R)...

Basically, the name of a Governor with the political party in parentheses next to it. How can I split the column into two, with the name in one column and the political party designation in another? I have tried using the separate() function, but cannot figure out how to accomplish this.

akrun · Accepted Answer

We can use extract to capture substrings as groups ((...)) and create two columns, i.e. capture all characters up to the (, then capture the letter inside the parentheses as the second group

library(tidyr)
extract(df1, name, into = c("name", "designation"), "(.*)\\(([^)]+)\\)")
#            name designation
#1      Kay Ivey            R
#2 Mike Dunleavy            R
#3    Doug Ducey            R
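
To see what each capture group holds, the same kind of pattern can be tried on a single value with base R's regexec() and regmatches(). This is only a sketch for inspection; the variable names are illustrative and not part of the answer. It also shows that the first group keeps the space before the parenthesis, which trimws() can remove if that matters.

```r
# Base R only; x is one sample value from the question.
x <- "Kay Ivey (R)"

# regexec() returns the positions of the full match and of each capture group;
# regmatches() turns those positions into the matched substrings.
m <- regmatches(x, regexec("(.*)\\(([^)]+)\\)", x))[[1]]

full_match <- m[1]  # the whole matched text
name_raw   <- m[2]  # group 1: everything before "(", including the space
party      <- m[3]  # group 2: the text between "(" and ")"

# trimws() removes the trailing space left in group 1.
name_clean <- trimws(name_raw)
```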

Or with separate, by specifying the sep as zero or more spaces (\\s*) followed by the opening bracket (\\(, escaped as it is a metacharacter) or the closing bracket (\\)), and specifying "drop" for the extra piece after the closing bracket (otherwise there would be a friendly warning)

df1 %>% 
  separate(name, into = c('name', 'designation'),
          sep = "\\s*\\(|\\)", extra = "drop")
 #          name designation
 #1      Kay Ivey           R
 #2 Mike Dunleavy           R
 #3    Doug Ducey           R

Or in base R, use gsub to replace the space and the () with a comma delimiter, then read the result with read.csv

read.csv(text = gsub("\\s\\(([^)]+)\\)", ",\\1", df1$name), 
     header = FALSE, col.names = c('name', 'designation'))
#          name designation
#1      Kay Ivey           R
#2 Mike Dunleavy           R
#3    Doug Ducey           R
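
The read.csv route relies on a comma as a temporary delimiter, so a name that itself contains a comma would be split into extra fields. A minimal base R sketch that avoids the delimiter, assuming every value ends in a parenthesised designation; the pattern and the object name out are illustrative choices, not part of the answer:

```r
# Sample data from the question, rebuilt so the block is self-contained.
df1 <- data.frame(
  name = c("Kay Ivey (R)", "Mike Dunleavy (R)", "Doug Ducey (R)"),
  stringsAsFactors = FALSE
)

# Group 1: the name, ending in a non-space character.
# Group 2: the text between the parentheses.
# Assumption: each value matches; sub() returns a non-matching value unchanged.
pattern <- "^(.*\\S)\\s*\\(([^)]+)\\)$"

out <- data.frame(
  name        = sub(pattern, "\\1", df1$name, perl = TRUE),
  designation = sub(pattern, "\\2", df1$name, perl = TRUE),
  stringsAsFactors = FALSE
)
```

Because each column is built directly from a capture group, no trailing space is left in name and commas inside a name are harmless.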

data

df1 <- structure(list(name = c("Kay Ivey (R)", "Mike Dunleavy (R)", 
"Doug Ducey (R)")), class = "data.frame", row.names = c(NA, -3L
))
