When using purrr::map_df(), I occasionally pass in a list of data frames where some items are NULL. When I do, map_df() returns a data frame with fewer rows than the original list.
I assume what's going on is that map_df() calls dplyr::bind_rows(), which ignores NULL values. However, I'm not sure how to identify my problematic rows.
Here's an example:
library(purrr)
problemlist <- list(NULL, NULL, structure(list(bounds = structure(list(northeast = structure(list(
lat = 41.49, lng = -71.46), .Names = c("lat", "lng"
), class = "data.frame", row.names = 1L), southwest = structure(list(
lat = 41.49, lng = -71.46), .Names = c("lat", "lng"
), class = "data.frame", row.names = 1L)), .Names = c("northeast",
"southwest"), class = "data.frame", row.names = 1L), location = structure(list(
lat = 41.49, lng = -71.46), .Names = c("lat", "lng"
), class = "data.frame", row.names = 1L), location_type = "ROOFTOP",
viewport = structure(list(northeast = structure(list(lat = 41.49,
lng = -71.46), .Names = c("lat", "lng"), class = "data.frame", row.names = 1L),
southwest = structure(list(lat = 41.49, lng = -71.46), .Names = c("lat",
"lng"), class = "data.frame", row.names = 1L)), .Names = c("northeast",
"southwest"), class = "data.frame", row.names = 1L)), .Names = c("bounds",
"location", "location_type", "viewport"), class = "data.frame", row.names = 1L))
# what actually happens
map_df(problemlist, 'location')
# lat lng
# 1 41.49 -71.46
# desired result
map_df_with_Null_handling(problemlist, 'location')
# lat lng
# 1 NA NA
# 2 NA NA
# 3 41.49 -71.46
I considered wrapping my location accessor in one of purrr's error-handling functions (e.g. safely() or possibly()), but I'm not running into errors; I'm just not getting the desired results.
What's the best way to handle NULL values with map_df()?
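The assumption about bind_rows() can be checked directly, and the same check shows one way to locate the problematic entries. This is a minimal sketch on a hypothetical toy list (`toy`, `combined`, and `null_positions` are illustrative names, not part of the example above); it assumes dplyr is installed alongside purrr.

```r
library(purrr)

# Hypothetical toy list: two NULL entries followed by one data frame.
toy <- list(NULL, NULL, data.frame(lat = 41.49, lng = -71.46))

# bind_rows() skips NULL elements, so only the single data frame contributes rows.
combined <- dplyr::bind_rows(toy)
nrow(combined)

# Positions of the NULL entries in the original list.
null_positions <- which(map_lgl(toy, is.null))
null_positions
```

Here `nrow(combined)` is 1 even though the list has three elements, and `null_positions` holds the indices 1 and 2, which are the entries that produced no rows.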
Upvotes: 6
You can use the (as of this writing undocumented) .null argument of any of the map*() functions to tell the function what to do when it encounters a NULL value:
map_df(problemlist, 'location', .null = tibble::data_frame(lat = NA, lng = NA))
# lat lng
# 1 NA NA
# 2 NA NA
# 3 41.49 -71.46
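In later purrr releases, the extractor's fallback argument is, as far as I know, named .default rather than .null, so the same idea may need to be spelled differently depending on the installed version. The following is a sketch under that assumption, using a hypothetical toy list and base data.frame() for the placeholder; `toy`, `placeholder`, `pieces`, and `result` are illustrative names.

```r
library(purrr)

# Hypothetical toy list: two NULL entries and one element with a `location` data frame.
toy <- list(
  NULL,
  NULL,
  list(location = data.frame(lat = 41.49, lng = -71.46))
)

# One-row placeholder returned whenever `location` cannot be extracted.
placeholder <- data.frame(lat = NA_real_, lng = NA_real_)

# Assumption: this purrr version accepts `.default` for character extractors.
pieces <- map(toy, "location", .default = placeholder)

# Every element is now a one-row data frame, so no rows are dropped when binding.
result <- dplyr::bind_rows(pieces)
result
```

Because each NULL entry is replaced by a one-row placeholder before binding, `result` has one row per list element, and the row order matches the original list.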
Upvotes: 5