Reputation: 1972
I am working on a project where we frequently work with a list of usernames. We also have a function to take a username and return a dataframe with that user's data. E.g.
users = c("bob", "john", "michael")
get_data_for_user = function(user)
{
data.frame(user=user, data=sample(10))
}
We often:
1. Loop through users
2. Call get_data_for_user to get their data
3. rbind the results into a single dataframe
I am currently doing this in a purely imperative way:
ret = get_data_for_user(users[1])
for (i in 2:length(users))
{
ret = rbind(ret, get_data_for_user(users[i]))
}
This works, but my impression is that all the cool kids are now using libraries like purrr to do this in a single line. I am fairly new to purrr, and the closest I can see is using map_df to convert the vector of usernames to a vector of dataframes. I.e.
dfs = map_df(users, get_data_for_user)
That is, it seems like I would still be on the hook for writing a loop to do the rbind.
I'd like to clarify whether my solution (which works) is currently considered best practice in R / amongst users of the tidyverse.
Thanks.
Upvotes: 0
Views: 550
Reputation: 6519
For the sake of completeness, here are some additional approaches:
Reduce(rbind, lapply(users, get_data_for_user))
library(data.table)
rbindlist(lapply(users, get_data_for_user))
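A base-R variant of the same idea, in case it helps to see it next to the original loop. This is only a sketch: it swaps Reduce for do.call(rbind, ...), which binds the whole list in one call, and the set.seed() calls are my addition so that both versions draw the same random numbers. Note that the loop in the question assumes at least two users, because 2:length(users) counts backwards when there is only one; the lapply version does not have that restriction.

```r
# Toy data source from the question: one data frame per user.
get_data_for_user <- function(user) {
  data.frame(user = user, data = sample(10))
}

users <- c("bob", "john", "michael")

# Imperative version from the question (needs length(users) >= 2).
set.seed(1)  # added for reproducibility, not part of the original code
loop_result <- get_data_for_user(users[1])
for (i in 2:length(users)) {
  loop_result <- rbind(loop_result, get_data_for_user(users[i]))
}

# Functional version: build a list of data frames, then bind them in one call.
set.seed(1)
bound <- do.call(rbind, lapply(users, get_data_for_user))

# Both approaches should yield the same rows in the same order.
all.equal(loop_result, bound)
```

Growing a data frame with rbind inside a loop copies the accumulated result on every iteration, so building the list first and binding once tends to scale better for long user lists.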
Upvotes: 1
Reputation: 2364
I would suggest a slight adjustment:
dfs = map_dfr(users, get_data_for_user)
map_dfr() explicitly states that you want to do a row bind. And I would be inclined to call this best practice when working with purrr.
Upvotes: 1
Reputation: 1006
That looks right to me - map_df handles the rbind internally (you'll need {dplyr} in addition to {purrr}).
FWIW, purrr::map_dfr() will do the same thing, but the function name is a bit more explicit, noting that it will be binding rows; purrr::map_dfc() binds columns.
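One addition for readers on newer versions: as far as I know, purrr 1.0.0 marks map_dfr() and map_dfc() as superseded in favour of a plain map() followed by list_rbind() or list_cbind(). A minimal sketch, assuming purrr >= 1.0.0 is installed and reusing the toy function from the question:

```r
library(purrr)  # assumes purrr >= 1.0.0, where list_rbind() is available

# Toy data source from the question: one data frame per user.
get_data_for_user <- function(user) {
  data.frame(user = user, data = sample(10))
}

users <- c("bob", "john", "michael")

# Step 1: map() returns a list with one data frame per user.
# Step 2: list_rbind() stacks that list row-wise into a single data frame.
dfs <- list_rbind(map(users, get_data_for_user))
```

The older map_dfr() call still works; this form just keeps the mapping step and the binding step separate.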
Upvotes: 1