krfurlong

Reputation: 897

Scraping City and State from Google Maps in R

I'm working on a project that requires some basic location information, but I have little relevant data to start with. I'm trying to write a function or a series of lines in R that takes the name of a landmark, uses that name in a Google Maps search, and then scrapes the resulting page to pull the city and state of the landmark.

Example

landmark.data <- data.frame(landmark.name = c("Packer Stadium", "University of Chicago", "Disney World"))
findCityState(landmark.data) # Insert amazing function here

  landmark.city landmark.state
1     Green Bay             WI
2       Chicago             IL
3       Orlando             FL

I've tried a number of packages, such as rvest and curl, but I've struggled with the CSS selection of the city/state and, to be honest, I'm not sure this is even possible.

Any other approaches to this problem that you might suggest would also be appreciated, but if there's one in R, that'd be ideal.

Thanks!!

Upvotes: 1

Views: 2343

Answers (1)

SymbolixAU

Reputation: 26258

There's probably no need to scrape the data, given that Google provides APIs you can use to access it.

My googleway package provides convenience functions for most of Google's Maps APIs, one of which is geocoding:

library(googleway)

## you need an api key to use their service
key <- 'your_api_key'

landmark.data <- c("Packer Stadium", "University of Chicago", "Disney World")

lst_result <- lapply(landmark.data, function(x){
    google_geocode(x, key = key)
})

The issue you may have is that there is no 'standard' address format, so you have to be a bit clever if you want to get specific pieces of address information.

For example, here are the address components returned for the first two of your search items:

lapply(lst_result, function(x){
    x[['results']][['address_components']]
})

# [[1]]
# [[1]][[1]]
# long_name   short_name                                  types
# 1            1265         1265                          street_number
# 2 Lombardi Avenue Lombardi Ave                                  route
# 3       Green Bay    Green Bay                    locality, political
# 4    Brown County Brown County administrative_area_level_2, political
# 5       Wisconsin           WI administrative_area_level_1, political
# 6   United States           US                     country, political
# 7           54304        54304                            postal_code
# 
# 
# [[2]]
# [[2]][[1]]
# long_name  short_name                                  types
# 1               5801        5801                          street_number
# 2 South Ellis Avenue S Ellis Ave                                  route
# 3         South Side  South Side                neighborhood, political
# 4            Chicago     Chicago                    locality, political
# 5            Chicago     Chicago administrative_area_level_3, political
# 6        Cook County Cook County administrative_area_level_2, political
# 7           Illinois          IL administrative_area_level_1, political
# 8      United States          US                     country, political
# 9              60637       60637                            postal_code
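One way to be "a bit clever" is to look up components by their `types` tag rather than by row position, since the rows shift between results (note the extra `neighborhood` row for Chicago). The following is only a minimal sketch: `extract_city_state` is a hypothetical helper, not part of googleway, and it assumes that `types` is a list column of character vectors as the printout above suggests. The `green_bay` data frame is a hand-built stand-in for the first result, so the example runs without an API key.

```r
## Hypothetical helper (not part of googleway): pull the city and state
## out of ONE address_components data frame.
## Assumption: `types` is a list column, one character vector per row.
extract_city_state <- function(components) {
    ## TRUE for every row whose `types` entry contains the given tag
    has_type <- function(type) {
        vapply(components$types, function(t) type %in% t, logical(1))
    }
    city  <- components$long_name[has_type("locality")]
    state <- components$short_name[has_type("administrative_area_level_1")]
    data.frame(
        landmark.city  = if (length(city))  city[1]  else NA_character_,
        landmark.state = if (length(state)) state[1] else NA_character_,
        stringsAsFactors = FALSE
    )
}

## Mock of the first result shown above, built by hand for illustration
green_bay <- data.frame(
    long_name  = c("1265", "Lombardi Avenue", "Green Bay", "Brown County",
                   "Wisconsin", "United States", "54304"),
    short_name = c("1265", "Lombardi Ave", "Green Bay", "Brown County",
                   "WI", "US", "54304"),
    stringsAsFactors = FALSE
)
green_bay$types <- list(
    "street_number",
    "route",
    c("locality", "political"),
    c("administrative_area_level_2", "political"),
    c("administrative_area_level_1", "political"),
    c("country", "political"),
    "postal_code"
)

city_state <- extract_city_state(green_bay)
print(city_state)
```

With live results, the same helper could presumably be applied to `x[['results']][['address_components']][[1]]` for each element of `lst_result` and the rows combined with `do.call('rbind', ...)`. Results that have no `locality` component come back as `NA` rather than raising an error.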

And while we're here, let's see what the results have actually given us on a map:

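## symbolix.utils appears to be a private helper package of mine;
## substitute your own Google Maps API key here
mapKey <- symbolix.utils::mapKey()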

lst_coordinates <- lapply(lst_result, function(x){
    x[['results']][['geometry']][['location']]
})

coordinates <- do.call('rbind', lst_coordinates)

google_map(key = mapKey) %>%
    add_markers(coordinates)


Upvotes: 2
