Reputation: 329
I am pretty new to R. This seems like a simple question, but I just don't know the best way to approach it. I have checked similar questions but have not found the answer I am looking for.
I have a list of data frames (actually tibbles) that I want to run through the convert() function from the hablar package to convert the data type of each variable in the data frames. I then want to overwrite the original data frames. Here is a simplified example data frame (N.B. all of the variables are currently factors). For simplicity I have made adm2 and adm3 the same as adm1, but they are different in my real data.
adm1 <- data.frame(admV1 = as.factor(c("male", "female", "male", "female")),
admV2 = as.factor(c("12.2", "13.0", "14.0", "15.1")),
admV3 = as.factor(c("free text", "more free text", "even more free text", "free text again")),
admV4 = as.factor(c("2019-01-01T12:00:00", "2019-01-01T12:00:00", "2019-01-01T12:00:00", "2019-01-01T12:00:00")))
adm1 <- as_tibble(adm1)
adm2 <- adm1
adm3 <- adm1
dis1 <- data.frame(disV1 = as.factor(c("yes", "no", "yes", "no")),
disV2 = as.factor(c("12.2", "13.0", "14.0", "15.1")),
disV3 = as.factor(c("free text", "more free text", "even more free text", "free text again")),
disV4 = as.factor(c("2019-01-01+T12:00:00", "2019-01-01+T12:00:00", "2019-01-01+T12:00:00", "2019-01-01+T12:00:00")))
dis1 <- as_tibble(dis1)
dis2 <- dis1
dis3 <- dis1
I have two 'types' of data frames: admissions and discharges. I defined the variables that need to be converted to each data type (N.B. In my real example each is a character vector containing more than one variable name):
# Define data types
adm_chr <- "admV3"
adm_num <- "admV2"
adm_fct <- "admV1"
adm_dte <- "admV4"
dis_chr <- "disV3"
dis_num <- "disV2"
dis_fct <- "disV1"
dis_dte <- "disV4"
I have then created a list of the datasets:
# Define datasets
adm_dfs <- list(adm1, adm2, adm3)
dis_dfs<- list(dis1, dis2, dis3)
This is what I have managed so far:
# Write function
convertDataTypes<- function(dfs, type = c("adm", "dis")){
outputs1<- dfs %>% lapply(convert(chr(paste0(type, "_chr")),
num(paste0(type, "_num")),
fct(paste0(type, "_fct"))))
outputs2<- dfs %>% mutate_at(vars(paste0(type, "_dte")),
ymd_hms, tz = "GMT")
}
# Run function
convertDataTypes(adm_dfs, "adm")
I think I need to then use lapply over outputs1 and outputs2 to assign the variables, but there is probably a much better way of approaching this. I would be very grateful for your input.
Upvotes: 1
Views: 147
Reputation: 886938
If the 'dfs' are a list of data.frames, then
library(hablar)
library(purrr)
library(dplyr)
library(stringr)    # for str_c()
library(lubridate)  # for ymd_hms()
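For the single-list case, a minimal sketch using plain dplyr verbs in place of hablar::convert() might look like the following. The adm_cols lookup list and the convert_one() helper are hypothetical names that do not appear in the question, and the toy tibble is a shortened stand-in for adm1.

```r
library(dplyr)
library(purrr)
library(lubridate)

# Toy stand-in for the question's tibbles (all columns start as factors).
adm1 <- tibble(
  admV1 = factor(c("male", "female")),
  admV2 = factor(c("12.2", "13.0")),
  admV3 = factor(c("free text", "more free text")),
  admV4 = factor(c("2019-01-01T12:00:00", "2019-01-02T08:30:00"))
)
adm_dfs <- list(adm1, adm1)

# Hypothetical lookup: column names grouped by target class.
adm_cols <- list(chr = "admV3", num = "admV2", fct = "admV1", dte = "admV4")

# Convert one data frame according to a lookup list like adm_cols.
convert_one <- function(df, cols) {
  df %>%
    mutate(
      across(all_of(cols$chr), as.character),
      # Go through character first: as.numeric() on a factor returns level codes.
      across(all_of(cols$num), ~ as.numeric(as.character(.x))),
      across(all_of(cols$fct), as.factor),
      across(all_of(cols$dte), ~ ymd_hms(as.character(.x), tz = "GMT"))
    )
}

# Apply the same conversion to every data frame in the list.
adm_converted <- map(adm_dfs, convert_one, cols = adm_cols)
```

Keeping the column names in one list per type avoids building variable names with paste0() and looking them up by string.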
If the 'type' corresponds to each data.frame in the list, use map2
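convertDataTypes <- function(dfs, type = c("adm", "dis")) {
  # Pair each sub-list of data frames with its type prefix
  map2(dfs, type, ~ {
    # Look up the column-name vectors (e.g. adm_chr) defined in the question
    chr_cols <- get(str_c(.y, "_chr"))
    num_cols <- get(str_c(.y, "_num"))
    fct_cols <- get(str_c(.y, "_fct"))
    dte_cols <- get(str_c(.y, "_dte"))
    map(.x, ~ .x %>%
      convert(chr(all_of(chr_cols)),
              num(all_of(num_cols)),
              fct(all_of(fct_cols))) %>%
      mutate_at(dte_cols, ymd_hms, tz = "GMT"))
  })
}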
dfsN <- list(adm_dfs, dis_dfs)
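The question also asks to overwrite the original data frames. One possible approach, sketched here with a base R conversion on toy data rather than the full convert() pipeline, is to name the list elements after the original objects and write them back with list2env(). The names num_cols, converted and flat are assumptions made for illustration.

```r
library(purrr)

# Toy stand-ins; in the question these would be the tibbles adm1, adm2, dis1, ...
adm1 <- data.frame(admV2 = c("12.2", "13.0"))
adm2 <- adm1
dis1 <- data.frame(disV2 = c("14.0", "15.1"))

# Naming each element after the object it came from lets us write it back later.
dfsN <- list(
  adm = list(adm1 = adm1, adm2 = adm2),
  dis = list(dis1 = dis1)
)
num_cols <- list(adm = "admV2", dis = "disV2")  # hypothetical lookup by type

# imap() passes each sub-list together with its name, which here is the type.
converted <- imap(dfsN, function(dfs, type) {
  map(dfs, function(df) {
    df[num_cols[[type]]] <- lapply(df[num_cols[[type]]], as.numeric)
    df
  })
})

# Flatten one level, then overwrite the original objects by name.
# environment() is the current environment (the global one at top level),
# so this replaces adm1, adm2 and dis1 in place; use with care.
flat <- c(converted$adm, converted$dis)
invisible(list2env(flat, envir = environment()))
```

Keeping the results in the list is usually easier to work with; writing back by name is only needed if later code relies on the individual objects.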
Upvotes: 1