Reputation: 335
I have data on metropolitan areas and want to extract just the city and state.
An example is
test <- c("Akron, OH METRO AREA","Auburn, NY Micro Area","Boston-Cambridge, MA-NH")
And I want it to look like
"Akron, OH", "Auburn, NY", "Boston-Cambridge, MA"
So just the City, State
Upvotes: 0
Views: 167
Reputation: 206243
An easy option is stringr::str_extract:
test <- c("Akron, OH METRO AREA","Auburn, NY Micro Area","Boston-Cambridge, MA-NH")
stringr::str_extract(test, "[^,]+, .{0,2}")
# [1] "Akron, OH" "Auburn, NY" "Boston-Cambridge, MA"
We match one or more characters that are not a comma, then a comma and a space, then up to two more characters.
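For comparison, here is a minimal base R sketch of the same extraction idea. The stricter pattern, which requires exactly two uppercase letters for the state, is an assumption and not part of the answer above, as is the variable name city_state.

```r
test <- c("Akron, OH METRO AREA", "Auburn, NY Micro Area", "Boston-Cambridge, MA-NH")

# Assumed stricter pattern: everything up to the first comma, then ", ",
# then exactly two uppercase letters (the state abbreviation).
pattern <- "[^,]+, [A-Z]{2}"

# regexpr() locates the first match in each string; regmatches() extracts it.
# Caveat: regmatches() drops elements that have no match at all.
city_state <- regmatches(test, regexpr(pattern, test))
print(city_state)
```

Unlike .{0,2}, the [A-Z]{2} form will not pick up lowercase letters or punctuation after the comma, at the cost of skipping rows whose state is not written as two capitals.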
Upvotes: 2
Reputation: 887183
An option is sub from base R: match the , followed by one or more whitespace characters (\\s+) and then the uppercase letters ([A-Z]+), capture all of that as a group ((...)), and let .* match the rest of the string. In the replacement, specify the backreference (\\1) of the captured group, so everything after the state is dropped.
sub("(,\\s+[A-Z]+).*", "\\1", test)
#[1] "Akron, OH" "Auburn, NY" "Boston-Cambridge, MA"
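As a rough sketch, the same call can be wrapped in a small helper to see how it behaves on input without a comma. The function name city_state and the "Rural area" input are hypothetical, added here only for illustration.

```r
test <- c("Akron, OH METRO AREA", "Auburn, NY Micro Area", "Boston-Cambridge, MA-NH")

# Hypothetical helper around the sub() call from the answer:
# keep the captured ", STATE" group and drop whatever follows it.
city_state <- function(x) sub("(,\\s+[A-Z]+).*", "\\1", x)

result <- city_state(test)
print(result)

# sub() returns the input unchanged when the pattern does not match,
# so strings without a ", STATE" part pass through as-is.
no_comma <- city_state("Rural area")
print(no_comma)
```

Because unmatched strings come back untouched rather than as NA, it may be worth checking for them separately if the data is not uniformly formatted.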
Upvotes: 4