Reputation: 96
I currently have a data frame of imported CSV data. It's a list of first and last names, job titles, and company names. Each entry is on a separate row. The first and last names, job title, and company name are all capitalized.
Each row is in this format:
First LastTitle, Company
I want to insert a comma delimiter before "Title", so that I can then sort the data into three columns, like the second answer on this question: splitting comma separated mixed text and numeric string with strsplit in R.
Essentially, in this specific case I want to locate the 3rd uppercase letter in each string, and then insert a comma delimiter before it.
This answer shows how to split a string on uppercase letters, but seems to only find the first uppercase letter: Splitting String based on letters case.
Any suggestions are appreciated.
Upvotes: 2
Views: 1536
Reputation: 4554
Try this:
str <- "First LastTitle, Company"
gsub('([a-z])(?=[A-Z])', '\\1,', str, perl = TRUE)
[1] "First Last,Title, Company"
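Note that this pattern matches every lowercase-to-uppercase boundary, not specifically the third capital letter. A minimal sketch of that behavior, using made-up sample strings (the second name is a hypothetical example, not from the question's data):

```r
# Hypothetical sample strings; the second has an extra lowercase-to-uppercase
# boundary inside the surname ("McDonald").
x <- c("First LastTitle, Company", "Ann McDonaldManager, Acme")

# gsub() replaces every match, so each lowercase letter that is directly
# followed by an uppercase letter gets a comma after it.
res <- gsub("([a-z])(?=[A-Z])", "\\1,", x, perl = TRUE)
print(res)
```

For data where names never contain an internal capital, the result should be the same as inserting a comma before the third uppercase letter.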
Upvotes: 1
Reputation: 19857
You could insert a comma after two repetitions of the pattern "one uppercase letter followed by one or more non-uppercase characters":
x <- "First LastTitle, Company"
sub("(([A-Z][^A-Z]+){2})(.*)","\\1,\\3",x)
[1] "First Last,Title, Company"
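Since the end goal in the question is three columns, here is a rough sketch of applying the same sub() call to a whole column and then splitting on the commas. The data frame `df`, its column `raw`, and the output column names are assumptions for illustration only:

```r
# Hypothetical data frame with a single text column named "raw".
df <- data.frame(raw = c("First LastTitle, Company", "Jane DoeEngineer, Acme"),
                 stringsAsFactors = FALSE)

# sub() is vectorized, so the pattern from the answer above works on the column.
with_comma <- sub("(([A-Z][^A-Z]+){2})(.*)", "\\1,\\3", df$raw)

# Split on a comma plus optional whitespace, then bind the pieces row-wise.
# This assumes every row yields exactly three pieces.
parts <- do.call(rbind, strsplit(with_comma, ",\\s*"))
out <- data.frame(name = parts[, 1], title = parts[, 2], company = parts[, 3],
                  stringsAsFactors = FALSE)
print(out)
```

Rows whose company name itself contains a comma would produce more than three pieces and would need separate handling.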
Upvotes: 0
Reputation: 1095
Split the string into a vector of single characters, then use grep
to find the positions of the uppercase letters and take the third one.
str <- "First LastTitle, Company"
tmp_str <- unlist(strsplit(str, ""))
ind <- grep("[A-Z]", tmp_str)[3]
paste0(c(tmp_str[1:(ind-1)], ",", tmp_str[ind:nchar(str)]), collapse="")
#[1] "First Last,Title, Company"
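The same idea can be applied to a whole vector at once. The following is only a sketch: it swaps the strsplit/grep steps for gregexpr(), and the helper name `insert_before_nth_upper` is made up for this example:

```r
# Hypothetical helper: insert a comma before the n-th uppercase letter
# of each element of a character vector.
insert_before_nth_upper <- function(x, n = 3) {
  # gregexpr() returns, per string, the start positions of all matches
  # (or -1 when there is no match, which is filtered out here).
  pos <- vapply(gregexpr("[A-Z]", x), function(m) {
    m <- m[m > 0]
    m[n]  # NA when the string has fewer than n uppercase letters
  }, integer(1))
  # Strings with fewer than n uppercase letters are returned unchanged.
  ifelse(is.na(pos), x,
         paste0(substr(x, 1, pos - 1), ",", substr(x, pos, nchar(x))))
}

res <- insert_before_nth_upper(c("First LastTitle, Company", "no capitals"))
print(res)
```

Returning short strings unchanged is a design choice for this sketch; depending on the data, flagging them as NA might be preferable.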
Upvotes: 3