Reputation: 588

Index of last whitespace in each item of a character vector

I have a character vector x:

 [1] "Mt. Everest" "Cho oyu" "Mont Blanc" "Ojos del Salado"

I am looking for output that gives the index of the last whitespace character in each item:

[1] 4 4 5 9

I believe I need to use sapply so that my function is applied to each item in the vector, but I have been unable to write it:

sapply(x,myFunction)

For myFunction I wrote something like:

myFunction <- function(a){
match(a,c(" "))
}

which understandably returns NA for every item, because no item consists of a single space.

I don't want to use stringr for this.

Upvotes: 0

Answers (6)

boski

Reputation: 2467

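A simple and concise alternative (tail(..., 1) is used here because last() is not base R; it comes from packages such as dplyr):

sapply(x, function(s) tail(which(strsplit(s, "")[[1]] == " "), 1))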

    Mt. Everest         Cho oyu      Mont Blanc Ojos del Salado 
              4               4               5               9

Upvotes: 0

tmfmnk

Reputation: 39858

You can also try grepRaw():

sapply(x, function(x) max(grepRaw(" ", x, all = TRUE)))

    Mt. Everest         Cho oyu      Mont Blanc Ojos del Salado 
              4               4               5               9

With dplyr:

library(dplyr)

data.frame(x) %>%
  mutate(res = sapply(x, function(x) max(grepRaw(" ", x, all = TRUE))))

                x res
1     Mt. Everest   4
2         Cho oyu   4
3      Mont Blanc   5
4 Ojos del Salado   9
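One caveat with grepRaw() is that it searches raw bytes, so the positions it returns are byte offsets rather than character indices. The two agree for the ASCII names above, but they can differ once a string contains multi-byte characters. The sketch below is only an illustration: the string "São Paulo" is a made-up example that is not part of the question, and regexpr() is used purely as a character-based comparison.

```r
# "S\u00e3o Paulo" is "São Paulo"; the \u escape marks the string as UTF-8,
# where "ã" occupies two bytes. (Hypothetical example string.)
s <- "S\u00e3o Paulo"

# Position of the last space counted in bytes
byte_pos <- max(grepRaw(" ", s, all = TRUE))

# Position of the last space counted in characters
char_pos <- as.integer(regexpr(" [^ ]*$", s))

byte_pos
# [1] 5
char_pos
# [1] 4
```

If the data may contain non-ASCII text and character indices are wanted, a character-based function such as regexpr() or gregexpr() is probably the safer choice.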

Upvotes: 0

Ronak Shah

Reputation: 388992

One way, using mapply, is to split each string on whitespace, count the characters in the last piece, and subtract that count from the total number of characters in the string.

myFunction <- function(a){
  mapply(function(p, q) q - nchar(p[length(p)]), strsplit(a, "\\s+"), nchar(a))
}  

myFunction(x)
#[1] 4 4 5 9

How it works:

Let's take the last element of the vector:

x <- "Ojos del Salado"

#Split on whitespace
p = strsplit(x, "\\s+")[[1]]
p
#[1] "Ojos"   "del"    "Salado"

#Select the last element 
p[length(p)]
#[1] "Salado"

#Count the number of characters in the last element
nchar(p[length(p)])
#[1] 6

#Subtract it from total characters in x
nchar(x) - nchar(p[length(p)])
#[1] 9

Data:

x <- c("Mt. Everest", "Cho oyu" ,"Mont Blanc", "Ojos del Salado")

Upvotes: 1

s_baldur

Reputation: 33488

Using stringr:

library(stringr)
myFunction <- function(a){
  str_locate(a, " (?=[^ ]*$)")[, 1]
}

myFunction(x)
# [1] 4 4 5 9

Using stringi (and avoiding regex):

library(stringi)
myFunction2 <- function(a){
  stri_locate_last_fixed(a, " ")[, 1]
}

myFunction2(x)
# [1] 4 4 5 9

Using strsplit() from base R (and avoiding regex also):

myFunction3 <- function(a){
  sapply(strsplit(a, ""), function(chars) max(which(chars == " ")))
}

myFunction3(x)
# [1] 4 4 5 9

Data:

x <- c("Mt. Everest", "Cho oyu", "Mont Blanc", "Ojos del Salado")

Upvotes: 0

NM_

Reputation: 1999

You can achieve this using gregexpr:

x = c("Mt. Everest", "Cho oyu", "Mont Blanc", "Ojos del Salado")

lapply(gregexpr(pattern=" ", x), max)

If you would like your answer as a vector:

sapply(gregexpr(pattern=" ", x), max)
[1] 4 4 5 9
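One thing to keep in mind with gregexpr() is that it reports -1 for a string with no match, so an item without any space would come back as -1 rather than as a missing value. A minimal sketch of one way to handle that, assuming NA is the preferred result for such items (the vector y and the entry "Everest" are made up for illustration and are not part of the question):

```r
# Hypothetical input: the second item contains no space
y <- c("Mt. Everest", "Everest", "Ojos del Salado")

# fixed = TRUE treats " " as a literal string instead of a regular expression
pos <- sapply(gregexpr(" ", y, fixed = TRUE), max)

# gregexpr() returns -1 when nothing matches; turn that into NA
pos[pos == -1] <- NA

pos
# [1]  4 NA  9
```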

Credit: Answer was improved with help of @markus

Upvotes: 1

Wimpel

Reputation: 27732

regexpr will do...

v <- c("Mt. Everest", "Cho oyu", "Mont Blanc", "Ojos del Salado")

#find the last space: a space followed only by non-space characters up to the end of the string
regexpr(" [^ ]*$", v)

#int [1:4] 4 4 5 9

library(dplyr)
data.frame( v = v ) %>% mutate( lastspace = regexpr(" [^ ]*$", v) )

#                 v lastspace
# 1     Mt. Everest         4
# 2         Cho oyu         4
# 3      Mont Blanc         5
# 4 Ojos del Salado         9
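The object returned by regexpr() is an integer vector that also carries attributes such as match.length, which is why printing it directly shows more than the positions. A small sketch, assuming a plain integer vector is what is wanted; the name m is introduced here only for illustration:

```r
v <- c("Mt. Everest", "Cho oyu", "Mont Blanc", "Ojos del Salado")

# " [^ ]*$" matches a space followed only by non-space characters up to the
# end of the string, so the match starts at the last space
m <- regexpr(" [^ ]*$", v)

# as.integer() drops the attributes and keeps only the start positions
as.integer(m)
# [1] 4 4 5 9

# Length of each match, i.e. the last space plus the final word
attr(m, "match.length")
# [1] 8 4 6 7
```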

Upvotes: 1
