Reputation: 428
I have a list of datasets that I obtained from multiple imputation. I would now like to recategorise a variable within this list of datasets. I have tried using the map function from purrr, but I have not had much luck, as the code below shows.
Is it possible to map a function that regroups and recodes a variable using purrr?
# download pacman package if not installed, otherwise load it
if(!require(pacman)) install.packages("pacman")
# loads relevant packages using the pacman package
pacman::p_load(
dplyr, # for pipes and manipulation
mice ) # for imputation
# make 10 datasets using mice
nhanes_imp <- parlmice(nhanes,
m = 10,
cluster.seed = 1234)
# put the imputed values into a list (one data frame per variable)
nhanes_imp <- nhanes_imp$imp
# create function to categorise chl
chl_funct <- function(x) {
if (x == "0") {
"0 days"
} else if (x < 100) {
"< 100"
} else if (x >= 100 & x < 150) {
"100 - 149"
} else if (x >= 150 & x < 200) {
"150 - 199"
} else if (x >= 200) {
">= 200"
}
}
# use the new function to categorise the chl var
nhanes_imp %>%
map_df(.$chl,
chl_funct)
When I run the code, this is the error that I get:
<error/rlang_error>
Can't convert a `data.frame` object to function
Backtrace:
1. nhanes_imp %>% map_df(.$chl, chl_funct)
2. purrr::map_df(., .$chl, chl_funct)
4. purrr:::as_mapper.default(.f, ...)
5. rlang::as_function(.f)
6. rlang:::abort_coercion(x, friendly_type("function"))
Upvotes: 1
Views: 172
Reputation: 887203
We can use cut
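chl_funct <- function(x) {
  # cut() intervals are right-closed by default, e.g. (99, 149].
  # Assuming chl holds whole numbers, breaks at 99, 149 and 199 put
  # 100, 150 and 200 in the upper bin, matching the labels.
  cut(x, breaks = c(-Inf, 0, 99, 149, 199, Inf),
      labels = c("0 days", "< 100", "100 - 149", "150 - 199", ">= 200"))
}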
Then use
library(dplyr)
nhanes_imp$chl <- nhanes_imp$chl %>%
mutate(across(everything(), chl_funct))
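One thing worth checking with cut is which side of each interval is closed, since that decides where a value sitting exactly on a break ends up. The sketch below uses base R only; the vector x is made up for illustration, and the "0 days" bin is left out for brevity.

```r
# Made-up values sitting on or next to the breaks
x <- c(99, 100, 149, 150, 200)
breaks <- c(-Inf, 100, 150, 200, Inf)
labels <- c("< 100", "100 - 149", "150 - 199", ">= 200")

# Default: intervals are right-closed, e.g. (100, 150]
right_closed <- cut(x, breaks = breaks, labels = labels)

# right = FALSE: intervals are left-closed, e.g. [100, 150)
left_closed <- cut(x, breaks = breaks, labels = labels, right = FALSE)

# Compare the two side by side
data.frame(x, right_closed, left_closed)
```

With the default, 100 is labelled "< 100" and 200 is labelled "150 - 199"; with right = FALSE they land in "100 - 149" and ">= 200", which is what the labels suggest.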
Upvotes: 2
Reputation: 388992
First, you should use a vectorised version in your function, because if/else tests a single value rather than a whole column. This can be done with ifelse or case_when; if you have many more categories, using cut would be better.
library(dplyr)
chl_funct <- function(x) {
case_when(x == 0 ~ "0 days",
x < 100 ~ "< 100",
x >= 100 & x < 150 ~ "100 - 149",
x >= 150 & x < 200 ~ "150 - 199",
TRUE ~ ">= 200")
}
You can then apply this function to every column of the dataset in nhanes_imp$chl.
nhanes_imp$chl <- nhanes_imp$chl %>% mutate(across(everything(), chl_funct))
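As for the purrr attempt: the pipe passes nhanes_imp as the first argument of map_df, so .$chl ends up in the .f slot, which is why the error complains that it cannot convert a data.frame to a function. If you would rather stay with purrr, one option is modify, which applies a function to each column and returns a data frame. This is only a sketch: chl_imp is a hypothetical stand-in for nhanes_imp$chl with invented values, not real mice output.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for nhanes_imp$chl: one row per missing case,
# one column per imputation (values invented for illustration)
chl_imp <- data.frame(`1` = c(99, 100, 187),
                      `2` = c(150, 204, 131),
                      check.names = FALSE)

# Vectorised recode; case_when picks the first condition that is TRUE
chl_funct <- function(x) {
  case_when(x == 0 ~ "0 days",
            x < 100 ~ "< 100",
            x < 150 ~ "100 - 149",
            x < 200 ~ "150 - 199",
            TRUE ~ ">= 200")
}

# modify() runs chl_funct on every column and keeps the data frame shape
chl_cat <- modify(chl_imp, chl_funct)
chl_cat
```

On the real object, the same call would be modify(nhanes_imp$chl, chl_funct), assuming nhanes_imp$chl is a data frame with one column per imputation.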
Upvotes: 2