Sam

Reputation: 329

Saving output of lapply to respective data frames

I am pretty new to R. This seems like a simple question, but I just don't know the best way to approach it. I have checked similar questions but have not found the answer I am looking for.

I have a list of data frames (actually tibbles) that I want to run through the convert() function from the hablar package to convert the data type of each variable in each data frame. I then want to overwrite the original data frames. Here is a simplified example (N.B. all of the variables are currently factors). For simplicity I have made adm2 and adm3 the same as adm1, but they are different in my real data.

adm1 <- data.frame(admV1 = as.factor(c("male", "female", "male", "female")),
                  admV2 = as.factor(c("12.2", "13.0", "14.0", "15.1")),
                  admV3 = as.factor(c("free text", "more free text", "even more free text", "free text again")),
                  admV4 = as.factor(c("2019-01-01T12:00:00", "2019-01-01T12:00:00", "2019-01-01T12:00:00", "2019-01-01T12:00:00")))

adm1 <- as_tibble(adm1)
adm2 <- adm1
adm3 <- adm1

dis1 <- data.frame(disV1 = as.factor(c("yes", "no", "yes", "no")),
                   disV2 = as.factor(c("12.2", "13.0", "14.0", "15.1")),
                   disV3 = as.factor(c("free text", "more free text", "even more free text", "free text again")),
                   disV4 = as.factor(c("2019-01-01+T12:00:00", "2019-01-01+T12:00:00", "2019-01-01+T12:00:00", "2019-01-01+T12:00:00")))

dis1 <- as_tibble(dis1)
dis2 <- dis1
dis3 <- dis1

I have two 'types' of data frames: admissions and discharges. I defined the variables that need to be converted to each data type (N.B. In my real example each is a character vector containing more than one variable name):

# Define data types
adm_chr <- "admV3"
adm_num <- "admV2"
adm_fct <- "admV1"
adm_dte <- "admV4"

dis_chr <- "disV3"
dis_num <- "disV2"
dis_fct <- "disV1"
dis_dte <- "disV4"

I have then created a list of the datasets:

# Define datasets
adm_dfs<- list(adm1, adm2, adm3)
dis_dfs<- list(dis1, dis2, dis3)

This is what I have managed so far:

# Write function
convertDataTypes<- function(dfs, type = c("adm", "dis")){
  outputs1<- dfs %>% lapply(convert(chr(paste0(type, "_chr")),
                                    num(paste0(type, "_num")),
                                    fct(paste0(type, "_fct"))))
  outputs2<- dfs %>% mutate_at(vars(paste0(type, "_dte")),
                               ymd_hms, tz = "GMT")
}

# Run function
convertDataTypes(adm_dfs, "adm")

I think I need to then use lapply over outputs1 and outputs2 to assign the variables, but there is probably a much better way of approaching this. I would be very grateful for your input.

Upvotes: 1

Views: 147

Answers (1)

akrun

Reputation: 886938

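If 'dfs' is a list of data.frames, first load the packages the function relies on (str_c comes from stringr and ymd_hms from lubridate):

library(hablar)
library(purrr)
library(dplyr)
library(stringr)
library(lubridate)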

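If each element of 'type' corresponds to one list of data.frames in 'dfs', use map2 to iterate over both in parallel, with an inner map over the individual data.frames: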

convertDataTypes <- function(dfs, type = c("adm", "dis")) {
  map2(dfs, type, ~ {
    .type <- .y
    map(.x, ~ .x %>%
      convert(chr(str_c(.type, "_chr")),
              num(str_c(.type, "_num")),
              fct(str_c(.type, "_fct"))) %>%
      mutate_at(vars(str_c(.type, "_dte")),
                ymd_hms, tz = "GMT"))
  })
}
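
One point worth checking, offered as a hedged aside: str_c(.type, "_chr") (like paste0) returns the string "adm_chr", not the character vector stored in the variable adm_chr. If the column names live in variables, as in the question, base R's get() (or mget() for several at once) can look them up by name. A minimal sketch, assuming the vectors are defined in the calling environment and using the values from the question's simplified example:

```r
# Column-name vectors as in the question (quoted, so they are character vectors)
adm_chr <- "admV3"
adm_num <- "admV2"

type <- "adm"

# paste0() only builds the *name* of the variable ...
var_name <- paste0(type, "_chr")
var_name
# [1] "adm_chr"

# ... while get() retrieves the object stored under that name
chr_cols <- get(var_name)
chr_cols
# [1] "admV3"

# mget() fetches several such objects at once and returns a named list
specs <- mget(paste0(type, c("_chr", "_num")))
```

Whether the lookup is needed depends on how the selection helpers treat a plain string, so it is worth testing on one data frame before mapping over the whole list.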

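dfsN <- list(adm_dfs, dis_dfs)
out <- convertDataTypes(dfsN, c("adm", "dis"))  # out[[1]]: converted adm list, out[[2]]: converted dis list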
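
To get from a returned list back to the individual objects (the "overwrite the original data frames" part of the question), one option is to name the list elements and then either keep working with the list or push the elements back with list2env(). The sketch below uses only base R so that it runs without hablar; convert_one() and adm_spec are hypothetical helpers made up for illustration, not part of any package, and the data is a shortened copy of the question's example.

```r
# Shortened data from the question (base data.frame, all columns as factors)
adm1 <- data.frame(
  admV1 = as.factor(c("male", "female")),
  admV2 = as.factor(c("12.2", "13.0")),
  admV3 = as.factor(c("free text", "more free text")),
  admV4 = as.factor(c("2019-01-01T12:00:00", "2019-01-01T12:00:00"))
)
adm2 <- adm1

# Hypothetical spec: which columns get which type
adm_spec <- list(chr = "admV3", num = "admV2", fct = "admV1", dte = "admV4")

# Hypothetical helper: convert one data frame according to a spec
convert_one <- function(df, spec) {
  df[spec$chr] <- lapply(df[spec$chr], as.character)
  # Go via character so factor labels, not integer codes, become numbers
  df[spec$num] <- lapply(df[spec$num], function(x) as.numeric(as.character(x)))
  df[spec$fct] <- lapply(df[spec$fct], as.factor)
  df[spec$dte] <- lapply(df[spec$dte], function(x) {
    as.POSIXct(as.character(x), format = "%Y-%m-%dT%H:%M:%S", tz = "GMT")
  })
  df
}

# A *named* list keeps track of which result belongs to which object
adm_dfs <- list(adm1 = adm1, adm2 = adm2)
adm_dfs <- lapply(adm_dfs, convert_one, spec = adm_spec)

# Optionally overwrite the original objects in the global environment
invisible(list2env(adm_dfs, envir = globalenv()))
```

Keeping everything in the named list is often simpler than writing back to separate objects, since later steps can again be applied with lapply or map.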

Upvotes: 1
