user6121484

Reputation: 143

Two levels of longitudinal data: how to reshape?

I have a data set with multiple time points of hippocampal volume for each subject. Each hippocampal volume has a left and right measurement. I now want to compare left and right change longitudinally. I know how to reshape my data for the time points, but I don't know how to add the levels of "side" to it.

So here is my reproducible data set:

mydata <- data.frame(SID=sample(1:150,400,replace=TRUE),
                     hippLeft_T1=sample(6000:8000,400,replace=TRUE),
                     hippRight_T1=sample(6000:8000,400,replace=TRUE),
                     hippLeft_T2=sample(6000:8000,400,replace=TRUE),
                     hippRight_T2=sample(6000:8000,400,replace=TRUE),
                     hippLeft_T3=sample(6000:8000,400,replace=TRUE),
                     hippRight_T3=sample(6000:8000,400,replace=TRUE))

This is then how I would reshape it longitudinally:

long <- reshape(mydata, direction="long",
                varying=list(c(2,4,6),c(3,5,7)),
                idvar="SID", timevar="time",
                v.names=c("HippLeft","HippRight"),
                times=c("time1","time2","time3"))

Should I apply reshape twice to get the levels for left and right in there? Or is there another way to do this? Thanks!

What I am trying to get is the following: [image not preserved]

Upvotes: 4

Views: 229

Answers (1)

aichao

Reputation: 7445

One way to do this is to use a combination of unite, gather, and separate from tidyr:

library(tidyr)
long <- mydata %>% unite("times1", hippLeft_T1,hippRight_T1) %>%
                   unite("times2", hippLeft_T2,hippRight_T2) %>%
                   unite("times3", hippLeft_T3,hippRight_T3) %>%
                   gather("times","Hipp",times1:times3) %>%
                   separate(Hipp,c("Left","Right")) %>%
                   gather("Side","Hipp",Left:Right)

Notes:

  1. First unite the left and right columns for each time T1, T2, and T3 and name these columns times1, times2, and times3
  2. Then, gather these three columns naming the key column times and the value column Hipp
  3. separate the Hipp column into Left and Right
  4. gather the Left and Right columns naming the key column Side and the value column Hipp
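To make the four steps concrete, here is a minimal sketch that runs them one at a time on a small data frame. The two-subject, two-time-point data set toy and the intermediate names step1 to step3 are hypothetical and only serve to show what each stage looks like; the calls themselves are the same tidyr functions used above.

```r
library(tidyr)

# Hypothetical toy data with the same column layout as mydata
toy <- data.frame(SID = c(1, 2),
                  hippLeft_T1 = c(7000, 7100), hippRight_T1 = c(6900, 6800),
                  hippLeft_T2 = c(7050, 7150), hippRight_T2 = c(6950, 6850))

# Step 1: paste left and right together per time point, e.g. "7000_6900"
step1 <- toy %>%
  unite("times1", hippLeft_T1, hippRight_T1) %>%
  unite("times2", hippLeft_T2, hippRight_T2)

# Step 2: stack the time columns into key column times and value column Hipp
step2 <- gather(step1, "times", "Hipp", times1:times2)

# Step 3: split the pasted values back into Left and Right
step3 <- separate(step2, Hipp, c("Left", "Right"))

# Step 4: stack Left and Right into key column Side and value column Hipp
toy_long <- gather(step3, "Side", "Hipp", Left:Right)
```

Printing step1, step2, and step3 in turn shows how the pasted strings carry both sides through the first gather before being split again.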

A more compact way is to reverse the order of the two gather operations by first uniting over times, which needs only two unite calls (one per side) instead of one per time point:

library(tidyr)
long <- mydata %>% unite("Left", hippLeft_T1,hippLeft_T2,hippLeft_T3) %>%
                   unite("Right", hippRight_T1,hippRight_T2,hippRight_T3) %>%
                   gather("Side","Hipp",Left:Right) %>%
                   separate(Hipp,c("times1","times2","times3")) %>%
                   gather("times","Hipp",times1:times3)
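One detail worth checking with the unite/separate approaches: separate returns character columns by default, so Hipp ends up as text rather than numbers. The following is a minimal sketch, on a hypothetical two-subject data frame named toy, of asking separate to convert the pieces back with its convert argument. It assumes the volumes are whole numbers, since the default separator would also split on a decimal point.

```r
library(tidyr)

# Hypothetical toy data: two subjects, two time points
toy <- data.frame(SID = c(1, 2),
                  hippLeft_T1 = c(7000L, 7100L), hippLeft_T2 = c(7050L, 7150L),
                  hippRight_T1 = c(6900L, 6800L), hippRight_T2 = c(6950L, 6850L))

toy_long <- toy %>%
  unite("Left", hippLeft_T1, hippLeft_T2) %>%
  unite("Right", hippRight_T1, hippRight_T2) %>%
  gather("Side", "Hipp", Left:Right) %>%
  # convert = TRUE turns the split strings back into numbers
  separate(Hipp, c("times1", "times2"), convert = TRUE) %>%
  gather("times", "Hipp", times1:times2)

# Inspect the type of the value column
class(toy_long$Hipp)
```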

A third approach using only one call to gather is:

library(dplyr)
library(tidyr)
long <- mydata %>% gather("Side","Hipp",-SID) %>%
                   mutate(times=paste0("times",sub(".*(\\d)$","\\1",Side)),
                          Side=sub("^hipp([A-Za-z]+)_T.*","\\1",Side)) %>%
                   select(SID,Side,times,Hipp)

Here, the key column Side from gather has values that are the original mydata column names. We use dplyr::mutate to derive a new column named times from it and then overwrite Side itself. In both cases sub with a regex does the work: one pattern extracts the last digit for the times values, and the other extracts either Left or Right for the Side values.
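The two regex substitutions can be tried on their own in base R. This is a small sketch using a hypothetical vector cols that holds a few of the column names; the letter class is written as [A-Za-z] here so that it matches letters only.

```r
# Hypothetical vector of column names in the same style as mydata
cols <- c("hippLeft_T1", "hippRight_T1", "hippLeft_T2", "hippRight_T3")

# Keep only the trailing digit and prefix it with "times"
times <- paste0("times", sub(".*(\\d)$", "\\1", cols))

# Keep only the letters between "hipp" and "_T"
side <- sub("^hipp([A-Za-z]+)_T.*", "\\1", cols)

times  # "times1" "times1" "times2" "times3"
side   # "Left"   "Right"  "Left"   "Right"
```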

Setting the seed to 123, your data is:

set.seed(123)
mydata <- data.frame(SID=sample(1:150,400,replace=TRUE),
                     hippLeft_T1=sample(6000:8000,400,replace=TRUE),
                     hippRight_T1=sample(6000:8000,400,replace=TRUE),
                     hippLeft_T2=sample(6000:8000,400,replace=TRUE),
                     hippRight_T2=sample(6000:8000,400,replace=TRUE),
                     hippLeft_T3=sample(6000:8000,400,replace=TRUE),
                     hippRight_T3=sample(6000:8000,400,replace=TRUE))
head(mydata)
##  SID hippLeft_T1 hippRight_T1 hippLeft_T2 hippRight_T2 hippLeft_T3 hippRight_T3
##1  44        7973         6941        7718         7279        6319         7465
##2 119        6274         6732        7775         6249        6289         7220
##3  62        7811         6242        6978         6510        6298         6448
##4 133        7153         6094        7436         7641        7029         7833
##5 142        6791         6525        6973         7608        6986         7606
##6   7        6900         7938        7978         6091        7233         6625

The result using either the second or third approach is:

print(long)
##     SID  Side  times Hipp
##   1  44  Left times1 7973
##   2 119  Left times1 6274
##   3  62  Left times1 7811
##   4 133  Left times1 7153
##   5 142  Left times1 6791
##   6   7  Left times1 6900
## ...
## 401  44 Right times1 6941
## 402 119 Right times1 6732
## 403  62 Right times1 6242
## 404 133 Right times1 6094
## 405 142 Right times1 6525
## 406   7 Right times1 7938
## ...
## 801  44  Left times2 7718
## 802 119  Left times2 7775
## 803  62  Left times2 6978
## 804 133  Left times2 7436
## 805 142  Left times2 6973
## 806   7  Left times2 7978
## ...
##1201  44 Right times2 7279
##1202 119 Right times2 6249
##1203  62 Right times2 6510
##1204 133 Right times2 7641
##1205 142 Right times2 7608
##1206   7 Right times2 6091
## ...
##1601  44  Left times3 6319
##1602 119  Left times3 6289
##1603  62  Left times3 6298
##1604 133  Left times3 7029
##1605 142  Left times3 6986
##1606   7  Left times3 7233
## ...
##2001  44 Right times3 7465
##2002 119 Right times3 7220
##2003  62 Right times3 6448
##2004 133 Right times3 7833
##2005 142 Right times3 7606
##2006   7 Right times3 6625
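
For readers on tidyr 1.0.0 or later, pivot_longer may be able to produce the same columns in a single call. This is only a sketch on a hypothetical two-subject data frame named toy, and it is an alternative to the gather-based approaches above rather than a replacement for them. Note that the rows come out grouped by original row rather than by column, so the ordering differs from the output shown above.

```r
library(tidyr)

# Hypothetical toy data with the same column layout as mydata
toy <- data.frame(SID = c(1, 2),
                  hippLeft_T1 = c(7000L, 7100L), hippRight_T1 = c(6900L, 6800L),
                  hippLeft_T2 = c(7050L, 7150L), hippRight_T2 = c(6950L, 6850L))

# Split each column name into Side and the time digit in one step
toy_long <- pivot_longer(toy,
                         cols = -SID,
                         names_to = c("Side", "times"),
                         names_pattern = "hipp(Left|Right)_T(\\d)",
                         values_to = "Hipp")

# Match the "times1", "times2", ... labels used above
toy_long$times <- paste0("times", toy_long$times)
```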

Upvotes: 3
