Reputation: 1335
As the title says. I have a bunch of names and I need to add a comma after the first word that starts with a capital letter.
An example:
txt <- c( "de Van-Smith J", "van der Smith G.H.", "de Smith JW", "Smith JW")
The result should be:
[1] "de Van-Smith, J" "van der Smith, G.H." "de Smith, JW" "Smith, JW"
I have mainly been trying to use gsub() and stringr::str_replace(), but I am struggling with the regex. Any advice would be appreciated.
Upvotes: 2
Views: 324
Reputation: 102339
Another sub() option:
> sub("([A-Z].*)(?=\\s)", "\\1,", txt, perl = TRUE)
[1] "de Van-Smith, J" "van der Smith, G.H." "de Smith, JW"
[4] "Smith, JW"
Upvotes: 1
Reputation: 887571
We can use
sub('\\b([A-Z]\\S+)', "\\1,", txt)
[1] "de Van-Smith, J" "van der Smith, G.H." "de Smith, JW" "Smith, JW"
Upvotes: 2
Reputation: 389175
You can use:
sub("([A-Z][\\w-]+)", "\\1,", txt, perl = TRUE)
#[1] "de Van-Smith, J" "van der Smith, G.H." "de Smith, JW" "Smith, JW"
where ([A-Z][\\w-]+) captures a word that starts with an uppercase letter, followed by one or more word characters or hyphens. Because sub() replaces only the first match, the comma is added after the first such word.
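As a minimal sketch, the same pattern can be wrapped in a small helper to make it easier to reuse and check. The function name add_surname_comma is hypothetical (it is not from the question or any package), and the example assumes names start with an ASCII capital letter, as in the sample data.

```r
# Hypothetical helper (name chosen for illustration only): insert a comma
# after the first word that begins with an uppercase ASCII letter.
add_surname_comma <- function(x) {
  # [A-Z]    an uppercase letter that starts the word
  # [\\w-]+  one or more word characters or hyphens (perl = TRUE, as above)
  # sub() replaces only the first match in each string
  sub("([A-Z][\\w-]+)", "\\1,", x, perl = TRUE)
}

txt <- c("de Van-Smith J", "van der Smith G.H.", "de Smith JW", "Smith JW")
result <- add_surname_comma(txt)
print(result)

# A string with no uppercase letter has no match and is returned unchanged.
unchanged <- add_surname_comma("smith jw")
print(unchanged)
```

Using sub() rather than gsub() matters here: gsub() would also add a comma after the initials, since they start with a capital letter too.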
Upvotes: 3