Reputation: 598
I'm currently working with SPSS files and imported them into R for some cleaning up and exploratory analysis. Afterwards I have to convert them back into .sav files (SPSS format) for the rest of my team.
I used the sjlabelled package to keep all value labels and variable labels, and made sure the labels stay intact through all manipulations (e.g. by using functions like dplyr's left_join() instead of cbind()).
Now my final data set is ready in R, and according to R all attributes are still correct. sjlabelled's write_spss() function produces an SPSS data set with the correct labels for all numeric variables.
However, in the process it converts all string variables (in my case, text responses) into numeric (factor) variables. The original text is preserved, but it is now attached as a value label describing the (made-up) numeric factor level.
Is there any way to prevent that from happening?
I also tried it via the foreign package, but this skipped all labels entirely:
library(foreign)
write.foreign(SPSS_new, "SPSS_test_new.txt", "SPSS_test_new.sps", package="SPSS")
I have attached screenshots of the data view and variable view in SPSS for my fictional test data set, taken before the import into R and after the export back to SPSS. In the third image, I marked the problems that arise.
SPSS Data view of the original file (before R)
SPSS Variable view of the original file (before R)
Now after I export it back from R to SPSS:
SPSS Variable view after the export back to SPSS
Here is the output in R for the structure of the data.frame that I export back to SPSS:
str(SPSS_new)
'data.frame': 10 obs. of 6 variables:
$ ID : num 1 2 3 4 5 6 7 8 9 10
..- attr(*, "label")= chr "Identifier"
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "display_width")= int 0
$ FactorVariable.x: num 1 2 3 2 1 2 3 2 1 1
..- attr(*, "label")= chr "This is a nominal variable"
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "display_width")= int 0
..- attr(*, "labels")= Named num 1 2 3
.. ..- attr(*, "names")= chr "male" "female" "not specified"
$ StringVariable : chr "This is a text" "This is more text" "I have space for 800 characters" "Test test test" ...
..- attr(*, "label")= chr "Qualitative text"
..- attr(*, "format.spss")= chr "A255"
..- attr(*, "display_width")= int 0
$ Ordinal : num 1 2 3 3 2 1 1 2 3 2
..- attr(*, "label")= chr "Ordinal variable"
..- attr(*, "format.spss")= chr "F8.3"
..- attr(*, "display_width")= int 0
..- attr(*, "labels")= Named num 1 2 3
.. ..- attr(*, "names")= chr "low" "medium" "high"
$ Interval : num 4.3 2.4 2.4 2.22 4.6 3 3.34 3.45 4.01 2.34
..- attr(*, "label")= chr "Interval variable"
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "display_width")= int 0
$ FactorVariable.y: num 1 2 3 2 1 2 3 2 1 1
..- attr(*, "label")= chr "This is a nominal variable"
..- attr(*, "format.spss")= chr "F8.2"
..- attr(*, "display_width")= int 0
..- attr(*, "labels")= Named num 1 2 3
.. ..- attr(*, "names")= chr "male" "female" "not specified"
Upvotes: 2
Views: 1248
Reputation: 7832
This happens because sjlabelled converts all variables to numeric or factor, since those types can carry value labels. I have now changed this so that character vectors are skipped, which seems to work:
library("sjlabelled")
data <- data.frame(stringVar = c("A", "B"), stringsAsFactors = FALSE)
str(data)
#> 'data.frame': 2 obs. of 1 variable:
#> $ stringVar: chr "A" "B"
write_spss(data, "data.sav")
#> Tidying value labels. Please wait...
#> Writing spss file to 'data.sav'. Please wait...
dataImport <- read_spss("data.sav", verbose = FALSE)
str(dataImport)
#> 'data.frame': 2 obs. of 1 variable:
#> $ stringVar: chr "A" "B"
#> ..- attr(*, "format.spss")= chr "A1"
Created on 2019-08-02 by the reprex package (v0.3.0)
You need to update sjlabelled from GitHub to check this on your computer: https://github.com/strengejacke/sjlabelled
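A minimal sketch of that update and check, assuming the remotes package is available for installing from GitHub (the install line is left commented out). The test data and file name are made up for illustration; only write_spss() and read_spss() come from the example above:

```r
# Install the development version from GitHub.
# Assumption: the remotes package is installed; uncomment to run.
# remotes::install_github("strengejacke/sjlabelled")

library(sjlabelled)

# Hypothetical test data with one numeric and one free-text column
test_data <- data.frame(
  id = c(1, 2),
  text = c("This is a text", "This is more text"),
  stringsAsFactors = FALSE
)

# Write to a temporary .sav file and read it back
out_file <- file.path(tempdir(), "string_check.sav")
write_spss(test_data, out_file)
reimported <- read_spss(out_file, verbose = FALSE)

# With the fix in place, the text column should still be a character vector
is.character(reimported$text)
```

If the last line returns FALSE, the installed version probably still predates the change.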
Upvotes: 0