Reputation: 71
I want to write a regex in R to remove all words of a string containing numbers.
For example:
first_text = "a2c if3 clean 001mn10 string asw21"
second_text = "clean string
Upvotes: 3
Views: 3691
Reputation: 740
It is easier to select words with no numbers than to select and delete words with numbers:
> library(stringr)
> str1 <- "a2c if3 clean 001mn10 string asw21"
> paste(unlist(str_extract_all(str1, "(\\b[^\\s\\d]+\\b)")), collapse = " ")
[1] "clean string"
Note:
Upvotes: 4
Reputation: 33488
A bit longer than some of the answers but very tractable is to first convert the string to a vector of words, then check word by word if there are any numbers and use standard R
subsetting.
first_text_vec <- strsplit(first_text, " ")[[1]]
first_text_vec
[1] "a2c" "if3" "clean" "001mn10" "string" "asw21"
paste(first_text_vec[!grepl("[0-9]", first_text_vec)], collapse = " ")
[1] "clean string"
Upvotes: 2
Reputation: 2454
Just another alternative using gsub
trimws(gsub("[^\\s]*[0-9][^\\s]*", "", first_text, perl=T))
#[1] "clean string"
Upvotes: 3
Reputation: 887128
Try with gsub
trimws(gsub("\\w*[0-9]+\\w*\\s*", "", first_text))
#[1] "clean string"
Upvotes: 10