user3304359

Reputation: 335

How can I keep two characters after a comma?

I have data on metropolitan areas and want to extract just the city and state from each entry.

An example is

test <- c("Akron, OH METRO AREA","Auburn, NY Micro Area","Boston-Cambridge, MA-NH")

And I want it to look like

"Akron, OH", "Auburn, NY", "Boston-Cambridge, MA"

So just the City, State

Upvotes: 0

Views: 167

Answers (2)

MrFlick

Reputation: 206243

An easy option is stringr::str_extract:

test <- c("Akron, OH METRO AREA","Auburn, NY Micro Area","Boston-Cambridge, MA-NH")
stringr::str_extract(test, "[^,]+, .{0,2}")
# [1] "Akron, OH"            "Auburn, NY"           "Boston-Cambridge, MA"

We match one or more characters that are not a comma, then a comma and a space, then up to two more characters.
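If you would rather avoid the stringr dependency, the same idea can probably be done in base R with regexpr and regmatches. This is only a sketch: it assumes every element contains a comma followed by a space and a two-letter uppercase state code, and it tightens `.{0,2}` to `[A-Z]{2}`. Note that regmatches silently drops elements that do not match, so the result can be shorter than the input.

```r
test <- c("Akron, OH METRO AREA", "Auburn, NY Micro Area", "Boston-Cambridge, MA-NH")

# regexpr() locates the first match in each string;
# regmatches() pulls out the matched text.
city_state <- regmatches(test, regexpr("[^,]+, [A-Z]{2}", test))
print(city_state)
# [1] "Akron, OH"            "Auburn, NY"           "Boston-Cambridge, MA"
```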

Upvotes: 2

akrun

Reputation: 887183

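One option is sub from base R. The pattern matches a comma, followed by one or more spaces (\\s+), followed by one or more uppercase letters ([A-Z]+), and captures all of that as a group ((...)); the trailing .* matches the rest of the string. The replacement is the backreference (\\1) to the captured group, so everything after the state abbreviation is dropped while the city name before the comma is left untouched.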

sub("(,\\s+[A-Z]+).*", "\\1", test)
#[1] "Akron, OH"            "Auburn, NY"           "Boston-Cambridge, MA"
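Since the question asks for exactly two characters after the comma, a possible variation is to replace `[A-Z]+` with `[A-Z]{2}`. The sketch below compares the two on a made-up input (the "Springfield, MASS" string is hypothetical, not from the question's data) where the uppercase run is longer than two letters.

```r
test <- c("Akron, OH METRO AREA", "Auburn, NY Micro Area", "Boston-Cambridge, MA-NH")

# Keep exactly two uppercase letters after the comma and spaces.
exact <- sub("(,\\s+[A-Z]{2}).*", "\\1", test)
print(exact)
# [1] "Akron, OH"            "Auburn, NY"           "Boston-Cambridge, MA"

# Hypothetical input with a longer uppercase run after the comma.
x <- "Springfield, MASS Metro Area"
greedy_x <- sub("(,\\s+[A-Z]+).*", "\\1", x)    # keeps the whole uppercase run
exact_x  <- sub("(,\\s+[A-Z]{2}).*", "\\1", x)  # keeps only two letters
print(greedy_x)
# [1] "Springfield, MASS"
print(exact_x)
# [1] "Springfield, MA"
```

On the question's data both patterns give the same result; they differ only when more than two uppercase letters directly follow the comma.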

Upvotes: 4
