Reputation: 897
I'm working on a project that requires some basic location information, but I have little relevant information to start from. I'm trying to write a function or a series of lines in R that takes the name of a landmark, uses that name in a Google Maps search, and then scrapes the resulting page to pull the city and state of the landmark.
Example
landmark.data <- data.frame(landmark.name = c("Packer Stadium", "University of Chicago", "Disney World"))
findCityState(landmark.data) # Insert amazing function here
landmark.city landmark.state
1 Green Bay WI
2 Chicago IL
3 Orlando FL
I've tried using a number of packages like rvest and curl, but I've struggled with the CSS selection of the city/state, and to be honest, I'm not even sure this is possible.
Any other approaches to this problem that you might suggest would also be appreciated, but if there's one in R, that'd be ideal.
Thanks!!
Upvotes: 1
Views: 2343
Reputation: 26258
There's probably no need to scrape the data given that Google has an API for you to access.
My googleway package provides convenience functions for most of Google's Maps APIs, one of which is geocoding:
library(googleway)
## you need an api key to use their service
key <- 'your_api_key'
landmark.data <- c("Packer Stadium", "University of Chicago", "Disney World")
lst_result <- lapply(landmark.data, function(x){
  google_geocode(x, key = key)
})
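Not every search string is guaranteed to geocode, so it can help to check each response before digging into it. The sketch below assumes each response is a list with a top-level `status` element that is "OK" on success, as in the Geocoding API's JSON. `has_match` and `mock_results` are hypothetical names, and the mock list stands in for real `google_geocode()` output so the example runs without an API key.

```r
## Hypothetical helper: TRUE when a geocode response reports a successful match
## (assumes a top-level `status` element, as in the Geocoding API's JSON)
has_match <- function(res) {
  identical(res[["status"]], "OK")
}

## Mock responses standing in for google_geocode() output (illustrative only)
mock_results <- list(
  list(status = "OK", results = data.frame(query = "Packer Stadium")),
  list(status = "ZERO_RESULTS", results = list())
)

## Flag the successful responses and keep only those
ok <- vapply(mock_results, has_match, logical(1))
matched <- mock_results[ok]
```

With real data, the same check would be applied to `lst_result` in place of `mock_results`.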
The issue you may have is that there is no 'standard' address format, so you have to be a bit clever if you want to get specific pieces of address information.
For example, here are the address components returned for the first two of your search items:
lapply(lst_result, function(x){
  x[['results']][['address_components']]
})
# [[1]]
# [[1]][[1]]
# long_name short_name types
# 1 1265 1265 street_number
# 2 Lombardi Avenue Lombardi Ave route
# 3 Green Bay Green Bay locality, political
# 4 Brown County Brown County administrative_area_level_2, political
# 5 Wisconsin WI administrative_area_level_1, political
# 6 United States US country, political
# 7 54304 54304 postal_code
#
#
# [[2]]
# [[2]][[1]]
# long_name short_name types
# 1 5801 5801 street_number
# 2 South Ellis Avenue S Ellis Ave route
# 3 South Side South Side neighborhood, political
# 4 Chicago Chicago locality, political
# 5 Chicago Chicago administrative_area_level_3, political
# 6 Cook County Cook County administrative_area_level_2, political
# 7 Illinois IL administrative_area_level_1, political
# 8 United States US country, political
# 9 60637 60637 postal_code
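One way to get from these components to the city and state you asked for is to look up rows by their `types` rather than by position, since the row order differs between results. This is a minimal sketch, not part of googleway: `get_component` is a hypothetical helper, and it assumes `types` is a list column of character vectors, as the printed output suggests. The mock data frame copies a few rows from the first result above so the example runs without an API key.

```r
## Hypothetical helper: return one field from the first component
## whose `types` include the requested type, or NA if there is none
get_component <- function(components, type, field = "long_name") {
  ## `types` is assumed to be a list column of character vectors
  hit <- vapply(components[["types"]], function(t) type %in% t, logical(1))
  if (!any(hit)) return(NA_character_)
  components[[field]][which(hit)[1]]
}

## Mock of part of the first result above (illustrative only)
components <- data.frame(
  long_name  = c("Green Bay", "Brown County", "Wisconsin"),
  short_name = c("Green Bay", "Brown County", "WI"),
  stringsAsFactors = FALSE
)
components$types <- list(
  c("locality", "political"),
  c("administrative_area_level_2", "political"),
  c("administrative_area_level_1", "political")
)

## City from the locality row, state abbreviation from the level-1 admin area
city_state <- data.frame(
  landmark.city  = get_component(components, "locality"),
  landmark.state = get_component(components, "administrative_area_level_1", "short_name"),
  stringsAsFactors = FALSE
)
```

With real responses you would pass each element of `x[['results']][['address_components']]` to the helper. Some places may come back without a `locality` row, in which case the helper returns NA and you would need a fallback type.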
And while we're here, let's see what the results have actually given us on a map:
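## symbolix.utils appears to be the author's own helper package;
## substitute your own Google Maps API key here
mapKey <- symbolix.utils::mapKey()
## collect the lat/lng data frame from each result
lst_coordinates <- lapply(lst_result, function(x){
  x[['results']][['geometry']][['location']]
})
## stack them into a single data frame, one row per landmark
coordinates <- do.call('rbind', lst_coordinates)
google_map(key = mapKey) %>%
  add_markers(coordinates)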
Upvotes: 2