Reputation: 413

Accessing element of a split string

If I have a string,

x <- "Hello World"

How can I access the second word, "World", using string split, after

x <- strsplit(x, " ")

x[[2]] does not do anything.

Upvotes: 22

Answers (6)

Friede

Reputation: 7400

As vapply() isn't mentioned, I would like to add it:

c("Hello world", "Hi there", "Back at ya") |>
  strsplit(split = " ") |>
  vapply(FUN = "[", FUN.VALUE = character(1L), 2L)
#> [1] "world" "there" "at"

Created on 2023-11-30 with reprex v2.0.2

Assumption: strsplit(x, " ") is mandatory. vapply() is generally preferred over sapply() (reference).
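To illustrate why vapply() is often preferred, here is a minimal sketch comparing the two on an empty input (the variable names are illustrative and not part of the answer above). sapply() falls back to an empty list, while vapply() returns the type declared in FUN.VALUE.

```r
# Hypothetical empty input, e.g. after every row has been filtered out
empty <- strsplit(character(0), " ")

# sapply() cannot guess the result type and returns an empty list
s <- sapply(empty, "[", 2L)

# vapply() returns the type declared in FUN.VALUE, here character
v <- vapply(empty, "[", FUN.VALUE = character(1L), 2L)

class(s)  # "list"
class(v)  # "character"
```

Code that later assumes a character vector keeps working with the vapply() result, which is the usual argument for choosing it in functions and pipelines.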

Upvotes: 0

ThomasIsCoding

Reputation: 101317

You could also use a regular expression with sub:

> x <- c("Hello world", "Hi there", "Back at ya")

> sub(".*?\\W+(\\w+).*","\\1",x)
[1] "world" "there" "at"
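One caveat worth checking, sketched here with a made-up one-word input: when the pattern does not match, sub() returns the string unchanged, whereas the strsplit() route yields NA.

```r
# "Hello" is a hypothetical input with no second word
y <- c("Hello world", "Hello")

# No match for the second string, so sub() leaves it as is
by_regex <- sub(".*?\\W+(\\w+).*", "\\1", y)

# Indexing past the end of a split gives NA instead
by_split <- vapply(strsplit(y, " "), "[", character(1L), 2L)

by_regex  # "world" "Hello"
by_split  # "world" NA
```

Which behaviour is preferable depends on whether a missing second word should be flagged as NA or passed through.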

Upvotes: 0

Maël

Reputation: 51974

With stringr 1.5.0, you can use str_split_i to access the ith element of a split string:

library(stringr)
x <- "Hello World"
str_split_i(x, " ", i = 2)
#[1] "World"

It is vectorized:

x <- c("Hello world", "Hi there", "Back at ya")
str_split_i(x, " ", 2)
#[1] "world" "there" "at"

Upvotes: 5

Kaleb Coberly

Reputation: 460

Another approach that might be a little easier to read and apply to a data frame within a pipeline (though it takes more lines) would be to wrap it in your own function and apply that.

library(tidyverse)

df <- data.frame(
  greetings = c( "Hello world", "Hi there", "Back at ya" )
)

split_params = function (x, sep, n) {
  # Splits string into list of substrings separated by 'sep'.
  # Returns nth substring.
  x = strsplit(x, sep)[[1]][n]
  
  return(x)
}


df = df %>%
  mutate(
    # New column, so the original greetings are kept.
    second_word = sapply(
      X = greetings,
      FUN = split_params,
      # Arguments for split_params.
      sep = ' ',
      n = 2
    )
  )

df

### (Output in RStudio Notebook)

greetings   second_word
<chr>       <chr>
Hello world world           
Hi there    there           
Back at ya  at          
3 rows
###
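The sapply() call is needed because split_params only looks at the first element of its input (the [[1]]). As a possible alternative, here is a base R sketch of a helper that is vectorised itself; nth_word is a hypothetical name, not part of the answer above.

```r
# nth_word is a hypothetical helper: returns the nth piece of every string in x
nth_word <- function(x, sep, n) {
  pieces <- strsplit(x, sep, fixed = TRUE)
  # NA is returned for strings with fewer than n pieces
  vapply(pieces, function(parts) parts[n], character(1L))
}

df <- data.frame(
  greetings = c("Hello world", "Hi there", "Back at ya"),
  stringsAsFactors = FALSE
)

df$second_word <- nth_word(df$greetings, " ", 2)
df$second_word  # "world" "there" "at"
```

Because the helper accepts a whole column, it can also be called directly inside mutate() without sapply().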

Upvotes: 0

vinit bagde

Reputation: 1

x=strsplit("a;b;c;d",";")

x

[[1]]
[1] "a" "b" "c" "d"

x=as.character(x[[1]])

x

[1] "a" "b" "c" "d"

x=strsplit(x," ")

x

[[1]]
[1] "a"

[[2]]
[1] "b"

[[3]]
[1] "c"

[[4]]
[1] "d"

Upvotes: -3

rosscova

Reputation: 5580

As mentioned in the comments, it's important to realise that strsplit returns a list object. Since your example splits only a single item (a vector of length 1), your list has length 1. I'll explain with a slightly different example, inputting a vector of length 3 (3 text items to split):

input <- c( "Hello world", "Hi there", "Back at ya" )

x <- strsplit( input, " " )

> x
[[1]]
[1] "Hello" "world"

[[2]]
[1] "Hi"    "there"

[[3]]
[1] "Back" "at"   "ya"

Notice that the returned list has 3 elements, one for each element of the input vector. Each of those list elements is split as per the strsplit call. So we can recall any of these list elements using [[ (this is what your x[[2]] call was doing, but you only had one list element, which is why you couldn't get anything in return):

> x[[1]]
[1] "Hello" "world"

> x[[3]]
[1] "Back" "at"   "ya"

Now we can get the second part of any of those list elements by appending a [ call:

> x[[1]][2]
[1] "world"

> x[[3]][2]
[1] "at"

This will return the second item from each list element (note that the "Back at ya" input has returned "at" in this case). You can do this for all items at once using something from the apply family. sapply will return a vector, which will probably be good in this case:

> sapply( x, "[", 2 )
[1] "world" "there" "at"

The last value in the input here (2) is passed to the [ operator, meaning the operation x[2] is applied to every list element.
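If passing "[" as a function looks cryptic, an equivalent form with an explicit anonymous function may be easier to read (a sketch; parts is just an illustrative argument name):

```r
input <- c("Hello world", "Hi there", "Back at ya")
x <- strsplit(input, " ")

# Same result as sapply(x, "[", 2), written out explicitly
second <- sapply(x, function(parts) parts[2])
second  # "world" "there" "at"
```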

If instead of the second item, you'd like the last item of each list element, we can use tail within the sapply call instead of [:

> sapply( x, tail, 1 )
[1] "world" "there" "ya"

This time, we've applied tail( x, 1 ) to every list element, giving us the last item.

My preferred way to chain operations like these is the magrittr pipe. For the second word:

library(magrittr)

x <- input %>%
    strsplit( " " ) %>%
    sapply( "[", 2 )

> x
[1] "world" "there" "at"

Or for the last word:

x <- input %>%
    strsplit( " " ) %>%
    sapply( tail, 1 )

> x
[1] "world" "there" "ya"
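Coming back to the single string from the question, the same indexing looks like this (a minimal sketch; the name words is only for illustration):

```r
words <- strsplit("Hello World", " ")

length(words)   # 1: one list element for the one input string
words[[1]]      # "Hello" "World"
words[[1]][2]   # "World"

# words[[2]] fails because there is no second list element
res <- tryCatch(words[[2]], error = function(e) conditionMessage(e))
res             # "subscript out of bounds"
```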

Upvotes: 38
