flee

Reputation: 1335

Add comma after first word starting with a capital letter

As the title says. I have a bunch of names and I need to add a comma after the first word that starts with a capital letter.

An example:

txt <- c( "de Van-Smith J", "van der Smith G.H.", "de Smith JW", "Smith JW")

The result should be:

[1] "de Van-Smith, J" "van der Smith, G.H." "de Smith, JW" "Smith, JW"  

I have mainly been trying to use gsub() and stringr::str_replace(), but I am struggling with the regex. Any advice would be appreciated.

Upvotes: 2

Views: 324

Answers (3)

ThomasIsCoding

Reputation: 102339

Another option using sub():

> sub("([A-Z].*)(?=\\s)", "\\1,", txt, perl = TRUE)
[1] "de Van-Smith, J"     "van der Smith, G.H." "de Smith, JW"
[4] "Smith, JW"
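
One thing worth checking before using this on other data: `.*` is greedy, so the match runs from the first capital letter up to the last whitespace in the string. That is fine for the sample names, which have a single space after the surname, but it would likely move the comma if the initials are separated by spaces. A minimal sketch, where the input "de Smith J W" and the lazy variant `.*?` are my own additions and not part of the question:

```r
txt <- c("de Van-Smith J", "van der Smith G.H.", "de Smith JW", "Smith JW")
expected <- c("de Van-Smith, J", "van der Smith, G.H.", "de Smith, JW", "Smith, JW")

# Greedy version from above: the capture ends before the LAST whitespace
greedy <- sub("([A-Z].*)(?=\\s)", "\\1,", txt, perl = TRUE)

# Hypothetical input with spaced initials (not in the original question)
spaced <- "de Smith J W"
greedy_spaced <- sub("([A-Z].*)(?=\\s)", "\\1,", spaced, perl = TRUE)

# Lazy variant (assumption): .*? stops before the FIRST whitespace after the capital
lazy <- sub("([A-Z].*?)(?=\\s)", "\\1,", txt, perl = TRUE)
lazy_spaced <- sub("([A-Z].*?)(?=\\s)", "\\1,", spaced, perl = TRUE)

print(greedy_spaced)
print(lazy_spaced)
```

On the four sample names both versions give the same result, so the difference only matters if your real data has more than one space after the surname.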

Upvotes: 1

akrun

Reputation: 887571

We can use:

sub('\\b([A-Z]\\S+)', "\\1,", txt)
[1] "de Van-Smith, J"     "van der Smith, G.H." "de Smith, JW"        "Smith, JW"          
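
If this needs to be applied in several places, the call can be wrapped in a small helper. This is only a sketch: the function name `add_comma` and the all-lowercase test input are made up for illustration. Here `\\b` anchors the match at a word boundary, `[A-Z]` is the capital letter, and `\\S+` takes the rest of the word up to the next whitespace. Since sub() replaces only the first match and leaves non-matching strings untouched, names without any capitalised word should come back unchanged.

```r
# Hypothetical helper wrapping the sub() call above
add_comma <- function(x) {
  # \\b    : word boundary before the capital letter
  # [A-Z]  : first uppercase letter
  # \\S+   : remaining non-whitespace characters of that word
  sub("\\b([A-Z]\\S+)", "\\1,", x)
}

txt <- c("de Van-Smith J", "van der Smith G.H.", "de Smith JW", "Smith JW")
res <- add_comma(txt)
print(res)

# Made-up input with no capitalised word: nothing to replace
unchanged <- add_comma("van der berg")
print(unchanged)
```

Note that `\\S+` requires at least one more character after the capital, so a one-letter word such as a lone initial is not treated as the surname.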

Upvotes: 2

Ronak Shah

Reputation: 389175

You can use:

sub("([A-Z][\\w-]+)", "\\1,", txt, perl = TRUE)

#[1] "de Van-Smith, J" "van der Smith, G.H." "de Smith, JW" "Smith, JW"

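where ([A-Z][\\w-]+) captures the first word that starts with an uppercase letter: [A-Z] matches the capital, and [\\w-]+ matches one or more word characters or hyphens after it. "\\1," then puts the captured word back followed by a comma. Because sub() replaces only the first match, later capitalised words such as the initials are left alone.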
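
To see which part of each name the pattern actually picks up, you can pull out the match with regexpr() and regmatches() before doing the replacement. A small sketch using the sample data from the question (the variable names are just for illustration):

```r
txt <- c("de Van-Smith J", "van der Smith G.H.", "de Smith JW", "Smith JW")

# Locate the first match of the pattern in each string (no capture group needed here)
m <- regexpr("[A-Z][\\w-]+", txt, perl = TRUE)

# Extract the matched word: this is where the comma will be appended
surname <- regmatches(txt, m)
print(surname)

# Same replacement as in the answer
res <- sub("([A-Z][\\w-]+)", "\\1,", txt, perl = TRUE)
print(res)
```

The hyphen is included in the character class so that "Van-Smith" is matched as one word. With plain \\w+ the match would stop at the hyphen.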

Upvotes: 3
