Ayush Raj Singh

Reputation: 873

Removing a certain pattern from a string

I have a vector like below:

t <- c("8466 W Peoria Ave", "4250 W Anthem Way", .....)

I want to convert it into:

t_mod <-c("Peoria Ave", "Anthem Way".....)

That is, I want to remove the numbers and the single-character tokens from each string in my vector.

Any help will really be appreciated.

Upvotes: 3

Views: 5307

Answers (4)

Roland

Reputation: 132706

char <- c("8466 W Peoria Ave", "4250 W Anthem Way")
gsub("[[:digit:]]+ *[[:alpha:]].","",char)
#[1] "Peoria Ave" "Anthem Way"

Upvotes: 0

asb

Reputation: 4432

I am not very good with regexes, but I can take a stab. How about this:

t_mod <- gsub("^[0-9]{1,} [a-zA-Z] ", "", t)

This strips one or more digits at the start of the string, followed by a space, a single letter, and then another space. The resulting t_mod looks like what you needed:

t_mod
[1] "Peoria Ave" "Anthem Way"
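
If the number or the single letter can appear somewhere other than the start of the string, a more general version might look like the sketch below. The helper name strip_tokens is made up for illustration, and it assumes that every all-digit token and every one-letter token should be dropped wherever it occurs, which may be too aggressive for real address data (for example, it would also drop the "O" in "O'Brien").

```r
# Hypothetical helper, not part of base R or of the pattern above:
# removes whole tokens that are all digits or a single letter,
# then tidies up the whitespace left behind.
strip_tokens <- function(x) {
  # \\b anchors both alternatives to word boundaries, so letters inside
  # longer words (the "P" in "Peoria") are left alone
  x <- gsub("\\b([0-9]+|[A-Za-z])\\b", "", x, perl = TRUE)
  x <- gsub(" +", " ", x)  # collapse runs of spaces
  trimws(x)                # drop leading and trailing whitespace
}

addr <- c("8466 W Peoria Ave", "4250 W Anthem Way")
strip_tokens(addr)
# [1] "Peoria Ave" "Anthem Way"
```

Unlike the anchored pattern, this also handles a trailing direction letter, so strip_tokens("12 Main St E") gives "Main St".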

Upvotes: 1

fdetsch

Reputation: 5308

Here you go:

# Data
t <- c("8466 W Peoria Ave", "4250 W Anthem Way")

# Remove the leading number (first alphanumeric run plus the space after it), then split by whitespace
t.char <- sub("[[:alnum:]]* ", "", t)
t.char.split <- strsplit(t.char, " ")

# Remove strings with only one character
t.mod <- sapply(t.char.split, function(i) {
  paste(i[which(nchar(i) > 1)], collapse = " ")
})

t.mod
[1] "Peoria Ave" "Anthem Way"

Upvotes: 1

user1609452

Reputation: 4444

tt <- c("8466 W Peoria Ave", "4250 W Anthem Way")
gsub(" [A-Za-z] ", "", gsub("[0-9]", "", tt))
[1] "Peoria Ave" "Anthem Way"

Upvotes: 4
