Reputation: 57
I am struggling to write efficient code for importing SAS data files.
My code is as follows:
library(foreign)
library(haven)

f <- file.path("E:/Cohortdata/Raw cohort/Nationalscreeningcohort/01.jk",
               c("nhis_heals_jk_2002.sas7bdat", "nhis_heals_jk_2003.sas7bdat",
                 "nhis_heals_jk_2004.sas7bdat", "nhis_heals_jk_2005.sas7bdat",
                 "nhis_heals_jk_2006.sas7bdat", "nhis_heals_jk_2007.sas7bdat",
                 "nhis_heals_jk_2008.sas7bdat", "nhis_heals_jk_2009.sas7bdat",
                 "nhis_heals_jk_2010.sas7bdat", "nhis_heals_jk_2011.sas7bdat",
                 "nhis_heals_jk_2012.sas7bdat", "nhis_heals_jk_2013.sas7bdat"))
d <- lapply(f, read_sas)
I know rewriting it with a for loop would be much more efficient, but I don't know what the code should look like.
I would be very thankful if you could help me.
Upvotes: 1
Views: 1119
Reputation: 4863
It's a variation of code that I posted here, but you can use it for SAS files too.
Please note that instead of using file.path() I used list.files(). That allowed me to read all the files in the path "E:/Cohortdata/Raw cohort/Nationalscreeningcohort", which is where I assumed your files are. In addition, I used the pattern argument to look only for sas7bdat files.
list.files() returns a vector, so here you can use whichever *apply method you like. However, I prefer changing the vector to a tbl_df and using the tidyverse approach, which means reading all the files with purrr::map() (part of the tidyverse) and creating one big tbl_df out of all the files.
library(tidyverse)
library(haven)   # read_sas() comes from haven; foreign is not needed here

df <- list.files(path = "E:/Cohortdata/Raw cohort/Nationalscreeningcohort",
                 full.names = TRUE,
                 recursive = TRUE,
                 pattern = "\\.sas7bdat$") %>%  # pattern is a regex, not a glob
  tbl_df() %>%                                  # one-column tibble named "value"
  mutate(data = map(value, read_sas)) %>%       # read each file into a list-column
  unnest(data)                                  # stack everything into one big data frame
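If you would rather stay with base R, as in the lapply() call from your question, a minimal sketch (assuming the same directory and that the yearly files share the same columns) could look like this:

library(haven)

files <- list.files(path = "E:/Cohortdata/Raw cohort/Nationalscreeningcohort",
                    full.names = TRUE,
                    recursive = TRUE,
                    pattern = "\\.sas7bdat$")

d <- lapply(files, read_sas)   # one data frame per file
d_all <- do.call(rbind, d)     # combine them into a single data frame

Either way you end up with one combined table; the tidyverse version just keeps everything in a single pipeline.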
Upvotes: 4