Reputation: 1293
I have a vector of character strings:
cities <- c("London", "001 London", "Stockholm", "002 Stockholm")
I need to erase everything in each string that precedes the first letter, so that I end up with:
cities <- c("London", "London", "Stockholm", "Stockholm")
I've tried, for example, this:
cities <- sub("^.*?[a-zA-Z]", "", cities)
but that erases the first letter too, which I don't want to happen.
Upvotes: 0
Views: 1099
Reputation: 4554
Delete the numbers:
gsub('\\d+','',cities)
[1] "London" " London" "Stockholm" " Stockholm"
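Note that the output above still carries the space that followed the digits. A minimal sketch of one way to drop it, assuming base R's trimws() is available; keep in mind that this pattern removes digits anywhere in the string, not only at the start:

```r
cities <- c("London", "001 London", "Stockholm", "002 Stockholm")

# Remove every run of digits, then trim the whitespace left behind.
cleaned <- trimws(gsub("\\d+", "", cities))
print(cleaned)
```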
Upvotes: 0
Reputation: 626871
Use
cities <- c("London", "001 London", "Stockholm", "002 Stockholm")
gsub("^\\P{L}*", "", cities, perl=TRUE)
See IDEONE demo
The ^\\P{L}* regex means:
^ - Assert the beginning of the string
\\P{L}* - 0 or more characters other than a letter.
This solution is preferable if you have city names starting with Unicode letters.
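To illustrate the Unicode point, here is a small sketch. The city name starting with a non-ASCII letter is a made-up input, not from the question, and the ASCII-only pattern is run with perl=TRUE here so that both calls go through the same regex engine:

```r
# Hypothetical input: the second name starts with O-umlaut (U+00D6).
x <- c("001 London", "003 \u00d6rebro")

# Unicode-aware: \P{L} matches anything that is not a letter in any script.
unicode_aware <- gsub("^\\P{L}*", "", x, perl = TRUE)

# ASCII-only class: U+00D6 is outside a-zA-Z, so it is stripped with the digits.
ascii_only <- sub("^[^a-zA-Z]*", "", x, perl = TRUE)

print(unicode_aware)
print(ascii_only)
```

With the ASCII-only class the first letter of the second name is lost, which is the case the \\P{L} version is meant to avoid.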
Upvotes: 3
Reputation: 174706
Use a negated character class to match all the non-alphabetic characters at the start of the string.
cities <- sub("^[^a-zA-Z]*", "", cities)
or
Use a capturing group to capture the first letter and put it back in the replacement with \\1.
cities <- sub("^.*?([a-zA-Z])", "\\1", cities)
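The two patterns agree on the sample data but differ when a string contains no ASCII letter at all. A quick sketch to show this; the "123" element is an extra input added for illustration, not part of the question:

```r
# "123" is a hypothetical edge case with no letters in it.
x <- c("London", "001 London", "123")

# Negated class: strips the leading non-letters, even if nothing is left.
by_class <- sub("^[^a-zA-Z]*", "", x)

# Capturing group: needs a letter to match, so "123" is returned unchanged.
by_group <- sub("^.*?([a-zA-Z])", "\\1", x)

print(by_class)
print(by_group)
```

Which behaviour is preferable depends on whether letter-free strings should be emptied or left alone.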
Upvotes: 3