Reputation: 3765
This is a solution I found for working with labelled data from SPSS in R.
I'm working with a survey provided in SPSS format, and I moved from foreign to haven.
I read "Convenient way to access variables label after importing Stata data with haven", but I could not find a way to express my labelled variables as factors.
I tried extracting the attributes with the purrr package and then converting some variables to factors, with no success.
library(dplyr)
library(haven)
library(purrr)
library(sjlabelled)
url = "http://users.dcc.uchile.cl/~mvargas/auxiliares_cc5208/nesi_individuals_with_grants_2015_spss.zip"
zip = paste0(getwd(), "/nesi_individuals_with_grants_2015_spss.zip")
sav = paste0(getwd(), "/nesi_individuals_with_grants_2015.sav")
download.file(url, zip, method = "curl")
# extract the .sav file into the working directory, ignoring archive paths
unzip(zip, exdir = getwd(), junkpaths = TRUE)
# read_sav() already returns a tibble
nesi_individuals_with_grants = read_sav(sav)
# as expected, the variables have no levels
# B14 records where people work (e.g. 1 = startup, 2 = bank, 3 = hospital, etc.)
levels(nesi_individuals_with_grants$B14)
classifications_all = nesi_individuals_with_grants %>%
  select(OCUP_REF, SEXO, CISE, CINE, B1, B14, C1) %>%
  rename(occupation_id = OCUP_REF, sex_id = SEXO, icse_id = CISE, isced_id = CINE,
         isco_id = B1, journey_id = C1)

occupation = classifications_all %>%
  select(occupation_id) %>%
  mutate(occupation = get_label(occupation_id)) %>%
  distinct()
That returns
# A tibble: 3 x 2
  occupation_id occupation
  <dbl+lbl>     <chr>
1 1             Binario Ocupados de Referencia Tabulados de Personas
2 NaN           Binario Ocupados de Referencia Tabulados de Personas
3 0             Binario Ocupados de Referencia Tabulados de Personas
That is the variable label, not the value labels, so then I tried
occupation = classifications_all %>%
  select(occupation_id) %>%
  distinct() %>%
  filter(!is.nan(occupation_id)) %>%
  mutate(occupation = get_labels(occupation_id))
It works!
> occupation
# A tibble: 2 x 2
  occupation_id occupation
  <dbl+lbl>     <chr>
1 1             Ocupados con menos de 1 mes en el empleo actual
2 0             Ocupados con más de 1 mes en el empleo actual
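One caveat about the step above: get_labels() returns the whole set of value labels as a character vector, so inside mutate() they are assigned to rows by position. The result is only right if the rows happen to be in the same order as the labels. A minimal sketch of a lookup by value, which does not depend on row order, is shown here. The vector `x` and its labels "No" and "Yes" are made up for illustration and are not taken from the survey.

```r
library(haven)

# Hypothetical labelled vector standing in for an SPSS column
x <- labelled(c(1, 0, 1), labels = c("No" = 0, "Yes" = 1))

# The value labels are stored as a named vector in the "labels" attribute
labs <- attr(x, "labels")

# Look each value up in the label table instead of assigning by position
looked_up <- names(labs)[match(unclass(x), labs)]
looked_up
# "Yes" "No" "Yes"
```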
Upvotes: 1
Views: 1368
Reputation: 7832
Do you want to set value labels as factor levels? Then you could try sjlabelled::as_label() or sjmisc::to_label(). Both are the same; I did not completely remove to_label from sjmisc, but kept it for backwards compatibility.
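As a rough illustration of that suggestion, the sketch below builds a small labelled vector by hand and converts it with sjlabelled::as_label(). The vector `x`, its English labels, and its variable label are invented for the example. The real column would come from read_sav().

```r
library(haven)
library(sjlabelled)

# Hypothetical toy vector standing in for a labelled SPSS column
x <- labelled(
  c(1, 0, 1),
  labels = c("More than 1 month" = 0, "Less than 1 month" = 1),
  label = "Reference occupation flag"
)

get_label(x)   # the variable label: a single description of the column
get_labels(x)  # the value labels: one label per coded value

# Replace the numeric codes with their value labels, as a factor
f <- as_label(x)
is.factor(f)
levels(f)
```

With the question's data, the same call would presumably go inside mutate(), for example mutate(occupation = as_label(occupation_id)), which avoids matching labels to rows by hand.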
Upvotes: 3