user2205916

Reputation: 3456

How to find the county for each city in a vector of city names using R?

Given a vector of city names, how would one go about finding the county that each city belongs to using R? I've looked at the maps and acs packages, but I'm not experienced with them. The goal is to find county-level data to associate with the cities in my data.

Say you have the following:

city <- c("RALEIGH", "HOLLYWOOD", "DALLAS", "MOUNTAIN VIEW", "OKLAHOMA CITY", "ORLANDO")
state <- c("NC", "CA", "TX", "CA", "OK", "FL")

Upvotes: 2

Views: 3845

Answers (1)

Roccer

Reputation: 919

"You can get city/state information in tab-separated value format from GeoNames.org. The data is free, comprehensive and well structured. For US data, grab the US.txt file at the free postal code data page. The readme.txt file on that page describes the format." See post by Joshua Frank

## Download the file

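temp <- tempfile(fileext = ".zip")
# mode = "wb" downloads the archive as binary, which keeps the zip intact on Windows
download.file("http://download.geonames.org/export/zip/US.zip", temp, mode = "wb")
con <- unz(temp, "US.txt")
# stringsAsFactors = FALSE keeps the text columns as character vectors
US <- read.delim(con, header = FALSE, stringsAsFactors = FALSE)
unlink(temp)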

## Find state and county

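# Columns 3, 5 and 6 hold the place name, state abbreviation and county name
colnames(US)[c(3, 5, 6)] <- c("city", "state", "county")
US$city <- tolower(US$city)
myCityNames <- tolower(c("RALEIGH", "HOLLYWOOD", "DALLAS", "MOUNTAIN VIEW", "OKLAHOMA CITY", "ORLANDO"))
# Keep only the rows whose city name is one of the requested names
myCities <- US[US$city %in% myCityNames, ]
myCities <- myCities[c("city", "state", "county")]
# The file has one row per postal code, so drop repeated city/state/county rows
myCities <- myCities[!duplicated(myCities), ]
myCities <- myCities[order(myCities$city, myCities$state, decreasing = TRUE), ]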

The problem is that several cities share the same name across different states, so matching on the city name alone returns more rows than you asked for.
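To see the ambiguity concretely, one could count how many distinct states each city name appears in. This is only a minimal sketch: `toy` is a small hand-made stand-in for the `myCities` table, and its rows are illustrative rather than taken from the GeoNames file.

```r
# Hypothetical stand-in for the deduplicated city/state/county table
toy <- data.frame(
  city   = c("dallas", "dallas", "hollywood", "hollywood", "raleigh"),
  state  = c("TX", "GA", "CA", "FL", "NC"),
  county = c("Dallas", "Paulding", "Los Angeles", "Broward", "Wake"),
  stringsAsFactors = FALSE
)

# Number of distinct states in which each city name occurs
statesPerCity <- tapply(toy$state, toy$city, function(s) length(unique(s)))

# City names that occur in more than one state
ambiguous <- names(statesPerCity)[as.vector(statesPerCity > 1)]
print(ambiguous)
```

Any name that shows up in `ambiguous` needs the state as well as the city to pin down a county.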

If you only want the cities in the states you mentioned, merging on both city and state narrows the result:

myPlaces <- data.frame(city = myCityNames, state = c("NC", "CA", "TX", "CA", "OK", "FL"))
merge(myCities, myPlaces, by = c("city", "state"), all.y = TRUE)
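The effect of `all.y = TRUE` is easier to see on a tiny example that runs without downloading anything. The sketch below uses made-up tables: `lookup` and `places` are hypothetical stand-ins for `myCities` and `myPlaces`, and "springfield" is deliberately missing from the lookup.

```r
# Hypothetical lookup table standing in for myCities
lookup <- data.frame(
  city   = c("dallas", "dallas", "raleigh"),
  state  = c("TX", "GA", "NC"),
  county = c("Dallas", "Paulding", "Wake"),
  stringsAsFactors = FALSE
)

# Places to look up; "springfield" has no matching row in `lookup`
places <- data.frame(
  city  = c("raleigh", "dallas", "springfield"),
  state = c("NC", "TX", "IL"),
  stringsAsFactors = FALSE
)

# all.y = TRUE keeps every row of `places`; unmatched rows get NA for county
result <- merge(lookup, places, by = c("city", "state"), all.y = TRUE)
print(result)
```

Rows with an `NA` county are the ones to check by hand, for example for spelling differences between your city names and the ones in the lookup table. If a city's postal codes fall in more than one county, the merge may also return several rows for that city.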

Upvotes: 2
