Reputation: 353
I have a similar question. I am trying to fetch the coordinates (latitude and longitude) of an address from the US Census geocoder. I followed the approach mentioned here; however, I am not getting the required result. Here are the steps I followed in my 3 attempts:
Attempt #1 (using RCurl):
library(RCurl)
url_geo <- "http://geocoding.geo.census.gov/geocoder/locations/address?form"
td.html <- getForm(url_geo,
                   submit = "Find",
                   street = "3211 Providence Dr",
                   city = "Anchorage",
                   state = "AK",
                   zip = "99508",
                   benchmark = "Public_AR_Current",
                   .opts = curlOptions(ssl.verifypeer = FALSE))
When I look at the output of td.html, it is the same as what you get from "View Page Source" on the above webpage. td.html should instead contain the details of the results page that appears after submitting the form on that webpage.
Attempt #2 (using httr):
library(httr)
url_geo <- "http://geocoding.geo.census.gov/geocoder/locations/address?form"
fd1 <- list(
  submit = "Find",
  street = "3211 Providence Dr",
  city = "Anchorage",
  state = "AK",
  zip = "99508",
  benchmark = "Public_AR_Current"
)
resp1 <- GET(url_geo, body = fd1, encode = "form")
content(resp1)
The content of resp1 is very different from what one would expect.
Attempt #3 (using rvest):
library(rvest)
url_geo <- "http://geocoding.geo.census.gov/geocoder/locations/address?form"
s <- html_session(url_geo)
f0 <- html_form(s)
Here, I get an error:
Error: Current page doesn't appear to be html.
Please help me understand what I am doing wrong. If you need any clarification from me, please let me know.
Upvotes: 2
Views: 506
Reputation: 78832
The Census site is being nice enough to send you back JSON (that was unexpected and a nice bonus from doing this call):
library(httr)
library(jsonlite)
URL <- "http://geocoding.geo.census.gov/geocoder/locations/address"
res <- GET(URL,
           query=list(street="3211 Providence Dr",
                      city="Anchorage",
                      state="AK",
                      zip="99508",
                      benchmark=4))
dat <- fromJSON(content(res, as="text"))
str(dat$result$addressMatches)
## 'data.frame': 1 obs. of 4 variables:
## $ matchedAddress : chr "3211 PROVIDENCE DR, ANCHORAGE, AK, 99508"
## $ coordinates :'data.frame': 1 obs. of 2 variables:
## ..$ x: num -150
## ..$ y: num 61.2
## $ tigerLine :'data.frame': 1 obs. of 2 variables:
## ..$ tigerLineId: chr "638504877"
## ..$ side : chr "L"
## $ addressComponents:'data.frame': 1 obs. of 12 variables:
## ..$ fromAddress : chr "3001"
## ..$ toAddress : chr "3399"
## ..$ preQualifier : chr ""
## ..$ preDirection : chr ""
## ..$ preType : chr ""
## ..$ streetName : chr "PROVIDENCE"
## ..$ suffixType : chr "DR"
## ..$ suffixDirection: chr ""
## ..$ suffixQualifier: chr ""
## ..$ city : chr "ANCHORAGE"
## ..$ state : chr "AK"
## ..$ zip : chr "99508"
You can use the flatten parameter to fromJSON to deal with that horrible data-frames-within-a-data-frame structure:
dat <- fromJSON(content(res, as="text"), flatten=TRUE)
dplyr::glimpse(dat$result$addressMatches)
## Observations: 1
## Variables: 17
## $ matchedAddress (chr) "3211 PROVIDENCE DR, ANCHORAGE, AK, 99508"
## $ coordinates.x (dbl) -149.8188
## $ coordinates.y (dbl) 61.18985
## $ tigerLine.tigerLineId (chr) "638504877"
## $ tigerLine.side (chr) "L"
## $ addressComponents.fromAddress (chr) "3001"
## $ addressComponents.toAddress (chr) "3399"
## $ addressComponents.preQualifier (chr) ""
## $ addressComponents.preDirection (chr) ""
## $ addressComponents.preType (chr) ""
## $ addressComponents.streetName (chr) "PROVIDENCE"
## $ addressComponents.suffixType (chr) "DR"
## $ addressComponents.suffixDirection (chr) ""
## $ addressComponents.suffixQualifier (chr) ""
## $ addressComponents.city (chr) "ANCHORAGE"
## $ addressComponents.state (chr) "AK"
## $ addressComponents.zip (chr) "99508"
This wraps it into a function for easier calling:
#' Geocode address using the Census API
#'
#' @param street Street
#' @param city City
#' @param state State
#' @param zip Zip code
#' @param benchmark "\code{current}" for the most current information,
#'        "\code{2014}" for data from the 2014 U.S. ACS survey,
#'        "\code{2010}" for data from the 2010 U.S. Census. This defaults
#'        to "\code{current}".
#' @return \code{list} of query params and response values. If successful,
#'         the geocoded values will be in \code{var$result$addressMatches}
census_geocode <- function(street, city, state, zip, benchmark="current") {
  URL <- "http://geocoding.geo.census.gov/geocoder/locations/address"
  # [[ ]] drops the name and raises an error for an unknown benchmark label
  bench <- c(`current`=4, `2014`=8, `2010`=9)[[benchmark]]
  res <- GET(URL,
             query=list(street=street, city=city, state=state,
                        zip=zip, benchmark=bench))
  warn_for_status(res)
  fromJSON(content(res, as="text"), flatten=TRUE)
}
census_geocode("3211 Providence Dr", "Anchorage", "AK", "99508")
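Once you have a flattened response like the one above, you still need to pull out the two coordinate columns. Here is a minimal sketch of that step. It assumes the response keeps the shape shown earlier (coordinates.x as longitude, coordinates.y as latitude); extract_coords is a hypothetical helper name, and sample_json is an abbreviated copy of the output above rather than a live call:

```r
library(jsonlite)

# Abbreviated sample shaped like the response shown above (not a live request).
sample_json <- '{"result":{"addressMatches":[{"matchedAddress":"3211 PROVIDENCE DR, ANCHORAGE, AK, 99508","coordinates":{"x":-149.8188,"y":61.18985}}]}}'

# Hypothetical helper: reduce a parsed, flattened response to address/lon/lat.
extract_coords <- function(parsed) {
  matches <- parsed$result$addressMatches
  # An empty JSON array parses to an empty list, not a data frame,
  # so treat anything that is not a non-empty data frame as "no match".
  if (!is.data.frame(matches) || nrow(matches) == 0) {
    return(data.frame(address = character(0),
                      lon = numeric(0),
                      lat = numeric(0),
                      stringsAsFactors = FALSE))
  }
  data.frame(address = matches$matchedAddress,
             lon = matches$coordinates.x,   # x is the longitude
             lat = matches$coordinates.y,   # y is the latitude
             stringsAsFactors = FALSE)
}

coords <- extract_coords(fromJSON(sample_json, flatten = TRUE))
no_match <- extract_coords(fromJSON('{"result":{"addressMatches":[]}}', flatten = TRUE))
```

With a live call you would pass the return value of census_geocode() to the helper instead of the parsed sample. Returning an empty data frame when nothing matches keeps the result type stable if you later bind the results for many addresses together.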
Upvotes: 3
Reputation: 24490
Build your URL and submit the resulting URL directly, bypassing any form! For instance, with the parameters you selected, you obtain the following URL:
urlgeo<-"http://geocoding.geo.census.gov/geocoder/locations/address?street=3211+Providence+Dr&city=Anchorage&state=AK&zip=99508&benchmark=4"
Then, you can simply retrieve the content through getURL (from the RCurl package):
getURL(urlgeo)
The result will have all the needed info. To build the URL, just paste its arguments together, replacing any blank space with a +.
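The URL-building step described above can be sketched in base R roughly as follows. build_geocode_url is a hypothetical helper name, and the percent-encoding of other reserved characters via URLencode is an extra precaution that goes beyond the plain space-to-plus replacement:

```r
# Hypothetical helper: assemble the geocoder URL from its parts.
build_geocode_url <- function(street, city, state, zip, benchmark = 4) {
  base <- "http://geocoding.geo.census.gov/geocoder/locations/address"
  # c() coerces everything to character, so benchmark becomes "4".
  params <- c(street = street, city = city, state = state,
              zip = zip, benchmark = benchmark)
  # Percent-encode reserved characters, then write spaces as "+".
  encoded <- vapply(params, function(p) {
    gsub("%20", "+", utils::URLencode(p, reserved = TRUE), fixed = TRUE)
  }, character(1))
  paste0(base, "?", paste(names(encoded), encoded, sep = "=", collapse = "&"))
}

urlgeo <- build_geocode_url("3211 Providence Dr", "Anchorage", "AK", "99508")
```

The resulting string can then be passed to getURL as before.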
Upvotes: 0