Achal Neupane

Reputation: 5719

Substitute the pattern of Nth word of a sentence in R

Suppose I have my sentence txt2 <- "useRs may fly into JFK or laGuardia"

I can capitalize the first and last letters of a word in this sentence. For the first word:

sub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", txt2, perl=TRUE)

and the last word as:

sub("(\\w)(\\w*)(\\w)+$", "\\U\\1\\E\\2\\U\\3", txt2, perl=TRUE)

How can I capitalize the third word as FlY using a similar approach in R?

Upvotes: 3

Views: 57

Answers (2)

M--

Reputation: 28825

Another approach, which I consider not half as robust as @Wiktor's answer:

txt2 <- "useRs may fly into JFK or laGuardia"
n <- 4

gsub(paste0('^(\\s*(?:\\S+\\s+){',n-1,'})\\S+'),
     paste0("\\1",gsub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3", 
                       unlist(strsplit(txt2, split=" "))[n], 
                       perl = TRUE)),
    txt2)

# [1] "useRs may fly IntO JFK or laGuardia"

This builds the capitalized form of the nth word (first and last letters uppercased) separately and then substitutes it for the nth word of the sentence, whereas Wiktor's answer does the job in a single substitution.
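If the outer regex feels heavy, the same "capitalize the word separately, then put it back" idea can be sketched by splitting and re-joining. This is only a rough sketch: cap_word_by_split is a made-up name, it handles a single string, and it assumes words are separated by single spaces.

```r
# Hypothetical helper, not part of the answer above.
# Assumes x is a single string with space-separated words.
cap_word_by_split <- function(x, n) {
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  # Only touch the sentence when the n-th word actually exists
  if (n >= 1 && n <= length(words)) {
    # Uppercase the first and last word characters of the n-th word
    words[n] <- sub("(\\w)(\\w*)(\\w)", "\\U\\1\\E\\2\\U\\3",
                    words[n], perl = TRUE)
  }
  paste(words, collapse = " ")
}

cap_word_by_split("useRs may fly into JFK or laGuardia", 4)
```

Splitting on a literal space keeps the code short, but punctuation stays attached to its word, so this is no more robust than the approach above.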

Upvotes: 1

Wiktor Stribiżew

Reputation: 626738

You can use

txt2 <- "useRs may fly into JFK or laGuardia"
id <- 3
sub(paste0("((?:\\w+\\W+){", id-1, "})(\\w)(\\w*)(\\w)"), "\\1\\U\\2\\E\\3\\U\\4", txt2, perl=TRUE)
## => [1] "useRs may FlY into JFK or laGuardia"

See the R demo online. Also, see the regex demo.

Note that sub replaces the first match only. The ((?:\w+\W+){2})(\w)(\w*)(\w) pattern matches

  • ((?:\w+\W+){2}) - Group 1: two occurrences of 1+ word chars followed by 1+ non-word chars
  • (\w) - Group 2: the first word char of the word to be processed
  • (\w*) - Group 3: the middle of the word to be processed
  • (\w) - Group 4: the last word char of the word to be processed.
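To reuse this for any word position, the pattern can be wrapped in a small function. The sketch below is not a definitive implementation: cap_nth_word is a hypothetical name, and the leading ^\W* anchor is my own addition. The anchor is there so that a target word shorter than two characters leaves the string unchanged instead of letting the regex match further to the right.

```r
# Hypothetical helper built on the pattern above.
# x: character vector, n: 1-based position of the word to change.
cap_nth_word <- function(x, n) {
  stopifnot(n >= 1)
  pattern <- paste0(
    "^(\\W*(?:\\w+\\W+){", n - 1, "})",  # Group 1: everything before word n
    "(\\w)(\\w*)(\\w)"                   # Groups 2-4: first, middle, last char
  )
  # \U starts uppercasing, \E stops it (both require perl = TRUE)
  sub(pattern, "\\1\\U\\2\\E\\3\\U\\4", x, perl = TRUE)
}

txt2 <- "useRs may fly into JFK or laGuardia"
cap_nth_word(txt2, 3)
## => [1] "useRs may FlY into JFK or laGuardia"
cap_nth_word(txt2, 7)
## => [1] "useRs may fly into JFK or LaGuardiA"
```

Since sub is vectorized over x, the same call works on a character vector of sentences as long as n is a single number.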

Upvotes: 3
