Matt

Reputation: 61

Retain SPSS value labels when working with data

I am analysing student level data from PISA 2015. The data is available in SPSS format here

I can load the data into R using the read_sav function in the haven package. I need to be able to edit the data in R and then save/export the data in SPSS format with the original value labels that are included in the SPSS download intact. The code I have used is:

library(haven)
student<-read_sav("CY6_MS_CMB_STU_QQQ.sav",user_na = T)
student2<-data.frame(student)
#some edits to data
write_sav(student2,"testdata1.sav")

When my colleague (who works in SPSS) tries to open the "testdata1.sav" the value labels are missing. I've read through the haven documentation and can't seem to find a solution for this. I have also tried read/write.spss in the foreign package but have issues loading in the dataset.

I am using R version 3.4.0 and the latest build of haven.

Does anyone know of a solution for this? I'd be very grateful for your help. Please let me know if you need any additional information to answer this.

Upvotes: 6

Views: 1587

Answers (3)

I think you need to recover the value labels in the data frame after importing the dataset into R, and then write that data frame to a .sav file.

# load libraries (dplyr provides %>% and mutate_at)
library(haven)
library(dplyr)

# load dataset
student <- read_sav("CY6_MS_CMB_STU_QQQ.sav", user_na = TRUE)

# identify all haven-labelled columns by name
factor_variable <- names(student)[vapply(student, is.labelled, logical(1))]

# convert all haven-labelled variables into factors
student2 <- student %>%
  mutate_at(factor_variable, as_factor)

#write dataset
write_sav(student2, "testdata1.sav")

Upvotes: 0

Thomas Buhl

Reputation: 303

library("sjlabelled")
student <- sjlabelled::read_spss("CY6_MS_CMB_STU_QQQ.sav")
student2 <- student
write_spss(student2, "testdata1.sav")

I have not tested this, but I hope it works. The sjlabelled package copes well with non-ASCII characters such as German umlauts.

Keep in mind, though, that R stores the labels as attributes. These attributes are lost during some data transformations (subsetting the data, for example). Once they are lost in R, they will of course not show up in SPSS either. The sjlabelled::copy_labels function is helpful in those cases:

student2 <- copy_labels(student2, student) #after data transformations and before export to spss
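To make the point about attributes concrete, here is a minimal sketch using haven rather than sjlabelled. The variable name sex and its 0/1 coding are made up for illustration, and this has not been run against the PISA file; it only shows where the value labels live and how to check whether they survive a write/read round trip.

```r
library(haven)

# Hypothetical toy variable: 0 = Male, 1 = Female
sex <- labelled(c(0, 1, 1), labels = c(Male = 0, Female = 1))

# Value labels are stored in the "labels" attribute of the vector
labels_before <- attr(sex, "labels")

# zap_labels() removes them explicitly, leaving a plain numeric vector
sex_plain <- zap_labels(sex)
labels_after <- attr(sex_plain, "labels")

# Round trip through a temporary .sav file to see what reaches SPSS
tmp <- tempfile(fileext = ".sav")
write_sav(tibble::tibble(sex = sex), tmp)
roundtrip <- read_sav(tmp)
labels_roundtrip <- attr(roundtrip$sex, "labels")
```

Running the same attr(..., "labels") check on a column of your own data just before write_sav is a quick way to tell whether the labels were already gone at that point.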

Upvotes: 1

Greenleaf

Reputation: 535

library(foreign)
df <- read.spss("spss_file.sav", to.data.frame = TRUE)

This may not be exactly what you are looking for, because it uses the labels as the data. So if you have an SPSS file with 0 for "Male" and 1 for "Female," you will have a df with values that are all Males and Females. It gets you one step further, but perhaps isn't the whole solution. I'm working on the same problem and will let you know what else I find.
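If you would rather keep the numeric codes, read.spss also has a use.value.labels argument. The following is a rough sketch under some assumptions: the toy file is written with haven only so there is something to read, and the variable sex with its 0/1 coding is hypothetical. I have not tried this on the PISA data.

```r
library(foreign)
library(haven)

# Write a small hypothetical .sav file to read back (0 = Male, 1 = Female)
tmp <- tempfile(fileext = ".sav")
write_sav(
  tibble::tibble(sex = labelled(c(0, 1, 1), labels = c(Male = 0, Female = 1))),
  tmp
)

# use.value.labels = FALSE keeps the codes instead of turning them into factors
df <- suppressWarnings(
  read.spss(tmp, to.data.frame = TRUE, use.value.labels = FALSE)
)

codes <- df[[1]]                               # the numeric codes
value_labels <- attr(df[[1]], "value.labels")  # where foreign keeps the labels
```

This keeps the codes available in R, but it does not by itself solve writing the labels back out to SPSS.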

Upvotes: 1
