Reputation: 109
I used the Google Geocoding API to request location data for thousands of addresses. The content for each request was parsed as a list. The resulting list was added under the column "get_response".
I'm having major difficulties extracting individual attributes from these lists using the purrr package, and was hoping you wonderful folks could help.
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.5.3
l1 <- list(results = list(list(geometry = list(location = list(lat = 41.9, lng = -87.6)))), status = "OK")
l2 <- list(results = list(list(geometry = list(location = list(lat = 35.1, lng = -70.6)))), status = "OK")
starting_df <- tribble(~name, ~get_response,
"first_location", l1,
"second_location", l2)
print(starting_df)
#> # A tibble: 2 x 2
#> name get_response
#> <chr> <list>
#> 1 first_location <named list [2]>
#> 2 second_location <named list [2]>
Below I demonstrate how I am able to extract the attribute one at a time:
pluck(starting_df[1,]$get_response, 1, "results", 1, "geometry", "location", "lat")
#> [1] 41.9
pluck(starting_df[2,]$get_response, 1, "results", 1, "geometry", "location", "lat")
#> [1] 35.1
This is my desired output:
desired_output <- tribble(~name, ~get_response, ~lat,
"first_location", l1, 41.9,
"second_location", l2, 35.1)
print(desired_output)
#> # A tibble: 2 x 3
#> name get_response lat
#> <chr> <list> <dbl>
#> 1 first_location <named list [2]> 41.9
#> 2 second_location <named list [2]> 35.1
This is my attempt at using purrr::map:
new_df <- mutate(starting_df, lat = map(get_response, pluck(1, "results", 1, "geometry", "location", "lat")))
#> Error: Can't convert NULL to function
Created on 2020-04-18 by the reprex package (v0.3.0)
Does anyone know a good way to do this?
Upvotes: 3
Views: 747
Reputation: 887911
We can use map_dbl from purrr:
library(dplyr)
library(purrr)
starting_df %>%
  # .default belongs to pluck(); it supplies NA when the path is missing
  mutate(lat = map_dbl(get_response, ~ pluck(.x, 1, 1,
    'geometry', 'location', 'lat', .default = NA_real_)))
# A tibble: 2 x 3
# name get_response lat
# <chr> <list> <dbl>
#1 first_location <named list [2]> 41.9
#2 second_location <named list [2]> 35.1
This also works when some elements do not have 'lat':
l3 <- list(results = list(list(geometry =
  list(location = list(lng = -70.6)))), status = "OK")
starting_df <- tribble(~name, ~get_response,
"first_location", l1,
"second_location", l2,
"third_location", l3)
starting_df %>%
  # .default belongs to pluck(); it supplies NA when the path is missing
  mutate(lat = map_dbl(get_response, ~ pluck(.x, 1, 1,
    'geometry', 'location', 'lat', .default = NA_real_)))
# A tibble: 3 x 3
# name get_response lat
# <chr> <list> <dbl>
#1 first_location <named list [2]> 41.9
#2 second_location <named list [2]> 35.1
#3 third_location <named list [2]> NA
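The same pattern extends to other fields such as lng. Below is a minimal sketch, not a definitive implementation: get_coord is a hypothetical helper name, and the two responses are made-up stand-ins shaped like l1 and l3. It indexes by name ("results") rather than by position, which is an assumption about how you prefer to address the response.

```r
library(dplyr)
library(purrr)

# Two illustrative responses shaped like l1 and l3; the second has no lat.
with_lat <- list(
  results = list(list(geometry = list(location = list(lat = 41.9, lng = -87.6)))),
  status = "OK"
)
without_lat <- list(
  results = list(list(geometry = list(location = list(lng = -70.6)))),
  status = "OK"
)

df <- tibble(
  name = c("first_location", "third_location"),
  get_response = list(with_lat, without_lat)
)

# Hypothetical helper: pull one coordinate from the first result,
# returning NA when any step of the path is missing.
get_coord <- function(resp, field) {
  pluck(resp, "results", 1, "geometry", "location", field, .default = NA_real_)
}

coords <- df %>%
  mutate(
    lat = map_dbl(get_response, ~ get_coord(.x, "lat")),
    lng = map_dbl(get_response, ~ get_coord(.x, "lng"))
  )
```

Keeping the path in one helper means a change in the response layout only has to be fixed in one place.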
Another option is rowwise from dplyr (the output below is for the original two-row starting_df):
starting_df %>%
  rowwise() %>%
  mutate(lat = pluck(get_response, 1, 1, 'geometry', 'location', 'lat'))
# A tibble: 2 x 3
# Rowwise:
# name get_response lat
# <chr> <list> <dbl>
#1 first_location <named list [2]> 41.9
#2 second_location <named list [2]> 35.1
Upvotes: 1
Reputation: 46978
You can use map_dbl from purrr, and apply your pluck using the formula format:
starting_df %>%
  mutate(lat = map_dbl(get_response, ~ pluck(.x, "results", 1, "geometry", "location", "lat")))
# A tibble: 2 x 3
name get_response lat
<chr> <list> <dbl>
1 first_location <named list [2]> 41.9
2 second_location <named list [2]> 35.1
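The original attempt failed because pluck(1, "results", ...) is evaluated immediately, before map ever sees it: it indexes into the number 1, finds nothing, and returns NULL, which map then cannot convert to a function. As a rough sketch (resp and responses are illustrative values, and "ZERO_RESULTS" is used here only as an example of a response with no results), purrr also accepts the path itself as a list and builds the extractor for you:

```r
library(purrr)

# Same nested shape as l1 in the question.
resp <- list(
  results = list(list(geometry = list(location = list(lat = 41.9, lng = -87.6)))),
  status = "OK"
)

# Evaluated right away: plucks from the number 1, not from each response.
eager <- pluck(1, "results", 1, "geometry", "location", "lat")
is.null(eager)

# Passing the path as a list lets map_dbl() build the extractor itself;
# .default is used when a step of the path is missing.
responses <- list(resp, list(results = list(), status = "ZERO_RESULTS"))
lat <- map_dbl(
  responses,
  list("results", 1, "geometry", "location", "lat"),
  .default = NA_real_
)
```

Inside mutate, the same call would read map_dbl(get_response, list("results", 1, "geometry", "location", "lat"), .default = NA_real_).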
Upvotes: 3