Alexey Ferapontov

Reputation: 5169

R: Abbreviate state names in strings

I have strings with state names in them. How do I efficiently abbreviate them? I am aware of state.abb[grep("New York", state.name)] but this works only if "New York" is the whole string. I have, for example, "Walmart, New York". Thanks in advance!

Let's assume this input:

x = c("Walmart, New York", "Hobby Lobby (California)", "Sold in Sears in Illinois")

Edit: the desired output is "Walmart, NY", "Hobby Lobby (CA)", "Sold in Sears in IL". As these examples show, the state name can appear anywhere in the string.

Upvotes: 3

Views: 694

Answers (1)

Josh O'Brien

Reputation: 162401

Here's a base R way, using gregexpr(), regmatches(), and regmatches<-():

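abbreviateStateNames <- function(x) {
    # One alternation pattern that matches any full state name
    pat <- paste(state.name, collapse="|")
    # Positions of every state-name match in each string
    m <- gregexpr(pat, x)
    # Look up the abbreviation for each matched name
    ff <- function(x) state.abb[match(x, state.name)]
    # Replace the matches in place with their abbreviations
    regmatches(x, m) <- lapply(regmatches(x, m), ff)
    x
}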

x <- c("Hobby Lobby (California)", 
       "Hello New York City, here I come (from Greensboro North Carolina)!")

abbreviateStateNames(x)
# [1] "Hobby Lobby (CA)"                                
# [2] "Hello NY City, here I come (from Greensboro NC)!"

Alternatively -- and quite a bit more naturally -- you can accomplish the same thing using the gsubfn package:

library(gsubfn)

pat <- paste(state.name, collapse="|")
gsubfn(pat, function(x) state.abb[match(x, state.name)], x)
# [1] "Hobby Lobby (CA)"                                
# [2] "Hello NY City, here I come (from Greensboro NC)!"
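One caveat with both versions: the pattern matches a state name anywhere, including inside a longer word, so "New Yorker" would become "NYer". If that matters for your data, a possible variant is to wrap the alternation in word boundaries. This is only a sketch; the function name abbreviateWholeStateNames and the use of perl=TRUE are my own choices here, not something the question requires:

abbreviateWholeStateNames <- function(x) {
    # Hypothetical variant: \\b on both sides so only whole-word matches count
    pat <- paste0("\\b(", paste(state.name, collapse="|"), ")\\b")
    m <- gregexpr(pat, x, perl=TRUE)
    # Same lookup as before: matched name -> two-letter abbreviation
    ff <- function(s) state.abb[match(s, state.name)]
    regmatches(x, m) <- lapply(regmatches(x, m), ff)
    x
}

y <- c("Walmart, New York", "New Yorker magazine", "Arkansas and Kansas")
abbreviateWholeStateNames(y)
# [1] "Walmart, NY"         "New Yorker magazine" "AR and KS"

Matching is still case-sensitive, so "new york" would be left alone unless you also pass ignore.case=TRUE and adjust the lookup accordingly.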

Upvotes: 6
