Separating multiple numbers from a list of numbers and letters in R

Question

I have a list that looks something like:

list <- c("2 chairs.", "1 chair & 4 books.", 
         "Sitting on 1 couch. Another 4 chairs & 3 books.", 
         NA, "1 chair.", 
         "3 books")

My list is actually 10k+ long, but this abbreviated list captures all the variations. I need to extract the number before chair(s) and the number before book(s) only. I prefer to end up with a list of lists where some lists will include two numbers, some lists will include one number and some lists will have only NA.

I have tried gsub() and strsplit() in a variety of ways to obtain the final result that I want with no luck.

Edit: Maybe I should have been more specific in my question above. I need the result to be numeric and not a number as a string. I would also prefer to have the NA values remain as NA. Thanks.

akrun · Accepted Answer

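We can use str_extract_all() from the stringr package. The lookahead keeps only the digits that are followed by optional whitespace and then "book(s)" or "chair(s)". Note that the backslash in \s must be doubled inside an R string, and [0-9]+ is needed to capture numbers with more than one digit.

library(stringr)
str_extract_all(list, "[0-9]+(?=\\s*(books?|chairs?))")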
#[[1]]
#[1] "2"

#[[2]]
#[1] "1" "4"

#[[3]]
#[1] "4" "3"

#[[4]]
#[1] NA

#[[5]]
#[1] "1"

#[[6]]
#[1] "3"
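
The question's edit asks for numeric values with NA preserved, while the output above is character. With stringr, wrapping the call in lapply(..., as.numeric) should be enough, since as.numeric() turns a character NA into a numeric NA. As a minimal sketch that does not depend on any package, the same lookahead can be applied with base R's gregexpr() and regmatches(). The names items, pattern and extract_counts are hypothetical, chosen for illustration (items is used instead of list to avoid masking the base function).

```r
# Sample data from the question, renamed to avoid masking base::list
items <- c("2 chairs.", "1 chair & 4 books.",
           "Sitting on 1 couch. Another 4 chairs & 3 books.",
           NA, "1 chair.",
           "3 books")

# One or more digits, followed by optional whitespace and book(s)/chair(s)
pattern <- "[0-9]+(?=\\s*(books?|chairs?))"

# Hypothetical helper: numeric matches for one string, NA stays NA
extract_counts <- function(x) {
  if (is.na(x)) return(NA_real_)
  # perl = TRUE is required for the lookahead (?=...)
  matches <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
  as.numeric(matches)
}

counts <- lapply(items, extract_counts)
str(counts)
```

Each element of counts is a numeric vector, so the third string yields 4 and 3 while the "1 couch" is skipped. A string with no chairs or books would give an empty numeric vector rather than NA.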
