DanG

Reputation: 741

Serialize and export labels and values labeled SPSS data into Json row by row

Following up on my previous question about exporting labels and values from labelled SPSS data to JSON: the suggested solution with serializeJSON() converts R objects to JSON while keeping all data and attributes intact. Ideally, though, the goal is to serialize the data row by row, with one entry per respondent (as in the example below), and apparently this is not possible with serializeJSON() on a data frame. Is there an alternative approach that transforms labelled SPSS data into JSON for each respondent while still capturing all data and attributes?

Here is the dummy sample data with id column:

library(haven)
library(labelled)
library(dplyr)  # provides the %>% pipe used below

df <- data.frame(
  id = c(1,2,3,4),
  a = labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2)),
  b = labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2, DK = 3)),
  c = labelled(c(1, 1, 2, 2), labels = c(No = 1, Yes = 2, DK = 3)),
  d = labelled(c("a", "a", "b", "c"), labels = c(No = "a", Yes = "b")),
  e = labelled_spss(
    c(1, 9, 1, 2),
    labels = c(No = 1, Yes = 2),
    na_values = 9
  )
)

df1 <- df %>% 
  set_variable_labels( a = "txt1- Do you use xxx?") %>% 
  set_variable_labels( b = "txt2-Do you use xxx?") %>% 
  set_variable_labels( c = "txt3-Do you use xxx?") %>% 
  set_variable_labels( d = "txt4-Do you use xxx?") %>% 
  set_variable_labels( e = "txt5-Do you use xxx?")  

> df1
  id a b c d e
1  1 1 1 1 a 1
2  2 1 1 1 a 9
3  3 2 2 2 b 1
4  4 3 3 2 c 2

Here is the suggested solution for converting R objects to JSON while keeping all data and attributes intact:

library(labelled)
library(jsonlite)
library(tibble) 

df1 <- df1 %>%
  as_tibble() # For prettier printing of labels

# Write json file
write(serializeJSON(df1), file = "dat.json")

This is an example of the ideal result after conversion to JSON:

[ 
  {
   id: 1, 
   question: "txt1- Do you use xxx?", 
   answer_value: 1, 
   answer_label: "my_label answer", 
   question: "txt2-Do you use xxx?",
   ...
   },
   {id: 2, 
   question: "txt1-Do you use xxx?", 
   answer_value: 2, 
   answer_label: "my_label answer", 
   question: "txt2-Do you use xxx?",
 ....
   }
]

Upvotes: 0

Views: 72

Answers (1)

deschen

Reputation: 10996

This doesn't answer the question 100%, but I'd suggest the following JSON structure:

library(jsonlite)

[
  {
    "id": 1,
    "question1": {
      "text": "txt1- Do you use xxx?",
      "answer_value": 1,
      "answer_label": "my_label answer"
    },
    "question2": {
      "text": "txt1- Do you use xxx?",
      "answer_value": 1,
      "answer_label": "my_label answer"
    }
  },
  {
    "id": 2,
    "question1": {
      "text": "txt1- Do you use xxx?",
      "answer_value": 1,
      "answer_label": "my_label answer"
    },
    "question2": {
      "text": "txt1- Do you use xxx?",
      "answer_value": 1,
      "answer_label": "my_label answer"
    }
  }
] 

This gives you one entry per ID. Each entry holds several questions, and each question nests the question text, the answer value, and the answer label.

You can get such a structure with:

toJSON(list(list("id" = 1,
                 "question1" = list("text" = "txt1- Do you use xxx?",
                                    "answer_value" = 1, 
                                    "answer_label" = "my_label answer"),
                 "question2" = list("text" = "txt1- Do you use xxx?",
                                    "answer_value" = 1, 
                                    "answer_label" = "my_label answer")),
            list("id" = 2,
                 "question1" = list("text" = "txt1- Do you use xxx?",
                                    "answer_value" = 1, 
                                    "answer_label" = "my_label answer"),
                 "question2" = list("text" = "txt1- Do you use xxx?",
                                    "answer_value" = 1, 
                                    "answer_label" = "my_label answer"))), pretty = TRUE, auto_unbox = TRUE)

The question remains, though, of how to create this structure from your data frame. I have dealt with such things a bit in my work, and it is fiddly to build. Generally, what you'd need to do is:

  • Go through your dataframe rowwise (can be done with dplyr's rowwise).
  • First create a list column containing the question text (from the variable label), the answer value, and the value label.
  • Do this for each question.
  • Glue together your id with the new list columns for each question into a new "total" list column.
  • Wrap this up into a list (so that the lists for each respondent are put together into one list) and feed that into the toJSON function.
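
The steps above can be sketched with plain lapply() loops instead of dplyr's rowwise. This is a minimal sketch under a few assumptions, not a definitive implementation: it rebuilds only columns a and b of the sample data as a tibble, it assumes every column except id is a labelled vector with a variable label, the helper question_entry() is a hypothetical name, and the na_values of labelled_spss columns are ignored. The JSON keys are the column names (a, b) rather than question1, question2.

```r
library(haven)     # labelled vector class
library(labelled)  # var_label(), val_labels()
library(jsonlite)  # toJSON()

# Stand-in for the question's data, reduced to two questions
df1 <- tibble::tibble(
  id = c(1, 2, 3, 4),
  a  = labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2)),
  b  = labelled(c(1, 1, 2, 3), labels = c(No = 1, Yes = 2, DK = 3))
)
var_label(df1) <- list(a = "txt1- Do you use xxx?",
                       b = "txt2-Do you use xxx?")

# Hypothetical helper: describe the i-th answer of one labelled column
question_entry <- function(column, i) {
  value  <- unclass(column)[[i]]   # raw answer code, without the labelled class
  labels <- val_labels(column)     # named vector, e.g. c(No = 1, Yes = 2)
  list(
    text         = var_label(column),
    answer_value = value,
    # NA when the value has no value label (e.g. 3 in column a)
    answer_label = names(labels)[match(value, labels)]
  )
}

# Assumption: every column except id is a labelled question
question_cols <- setdiff(names(df1), "id")

# One list per respondent: the id followed by one entry per question
respondents <- lapply(seq_len(nrow(df1)), function(i) {
  answers <- lapply(question_cols, function(q) question_entry(df1[[q]], i))
  names(answers) <- question_cols
  c(list(id = df1$id[i]), answers)
})

# na = "null" writes missing labels as JSON null
json <- toJSON(respondents, pretty = TRUE, auto_unbox = TRUE, na = "null")
```

Printing with cat(json) or saving with write(json, file = "dat.json") then gives one JSON object per respondent. Unlike serializeJSON(), this output is not meant to be read back into an identical R object; it only carries the question text, the answer value, and the answer label.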

Upvotes: 1
