jt.wingate
jt.wingate

Reputation: 59

Combining variables in wide format in R

I have a dataframe currently in wide format that has pre- and post- intervention questionnaire scores. There are 848 patients with at least one questionnaire. I would like to look at the data to compare how the scores changed from pre- to post-intervention. However, I am struggling since that data is in wide format and the data is not uniform in terms of the number of questionnaires per patient. There is a variable that lists the instance of the questionnaire and the value of the questionnaire is another variable.

Here is an example of how it currently looks:

a=c('instance1','total1','instance2', 'total2', 'instance3', 'total3', 
'instance4','total4', 'instance5','total5')
b=c('postop2', '5', 'postop1', '7', NA, NA, 'preop', '10', NA, NA)
c=c(NA, NA, 'preop', '3', NA, NA, 'postop1', '4', 'postop2', '3')
data.frame(rbind(a,b,c))

There are 848 rows of questionnaire data.

I need to calculate the differences from preop to postop questionnaires. What is the best way to reorganize this data so I can get these values? I am struggling since the instances do not line up between all the 848 patients.

Thanks for you help.

Upvotes: 0

Views: 88

Answers (1)

Jeppe Olsen
Jeppe Olsen

Reputation: 1008

Not sure if this is what you are after, but try this data.table approach:

library(data.table)

DT <- data.table(rbind(b,c))
names(DT) <- a

DT2 <- melt(DT, measure = patterns("^instance", "^total"), value.name = c("instance", "total"))

DT2

    variable instance total
 1:        1  postop2     5
 2:        1       NA    NA
 3:        2  postop1     7
 4:        2    preop     3
 5:        3       NA    NA
 6:        3       NA    NA
 7:        4    preop    10
 8:        4  postop1     4
 9:        5       NA    NA
10:        5  postop2     3

EDIT:

If you want a summary you could also do this:

DT2$total <- as.numeric(DT2$total)  # for some reason total is class character

DT2[, sum(total), by=instance]

   instance V1
1:  postop2  8
2:       NA NA
3:  postop1 11
4:    preop 13

Upvotes: 2

Related Questions