sats
sats

Reputation: 33

Stacking time series data vertically

I am struggling with manipulation of time series data. The dataset has first Column containing information about time points of data collection, 2nd column onwards contains data from different studies.I have several hundred studies. As an example I have included sample data for 5 studies. I want to stack the dataset vertically with time and datapoints for each study. Example data set looks like data provided below:

TIME    Study1  Study2  Study3  Study4  Study5
0.00    52.12   53.66   52.03   50.36   51.34
90.00   49.49   51.71   49.49   48.48   50.19
180.00  47.00   49.83   47.07   46.67   49.05
270.00  44.63   48.02   44.77   44.93   47.95
360.00  42.38   46.28   42.59   43.25   46.87
450.00  40.24   44.60   40.50   41.64   45.81
540.00  38.21   42.98   38.53   40.08   44.78

I am looking for an output in the form of:

TIME    Study   ID
0   52.12   1
90  49.49   1
180 47  1
270 44.63   1
360 42.38   1
450 40.24   1
540 38.21   1
0   53.66   2
90  51.71   2
180 49.83   2
270 48.02   2
360 46.28   2
450 44.6    2
540 42.98   2
0   52.03   3
90  49.49   3
180 47.07   3
270 44.77   3
...

Upvotes: 3

Views: 204

Answers (2)

gung - Reinstate Monica
gung - Reinstate Monica

Reputation: 11893

This is a classic 'wide to long' dataset manipulation. Below, I show the use of the base function ?reshape for your data:

d.l <- reshape(d, varying=list(c("Study1","Study2","Study3","Study4","Study5")),
               v.names="Y", idvar="TIME", times=1:5, timevar="Study",
               direction="long")
d.l <- d.l[,c(2,1,3)]
rownames(d.l) <- NULL
d.l
#    Study TIME     Y
# 1      1    0 52.12
# 2      1   90 49.49
# 3      1  180 47.00
# 4      1  270 44.63
# 5      1  360 42.38
# 6      1  450 40.24
# 7      1  540 38.21
# 8      2    0 53.66
# 9      2   90 51.71
# 10     2  180 49.83
# 11     2  270 48.02
# 12     2  360 46.28
# 13     2  450 44.60
# 14     2  540 42.98
# 15     3    0 52.03
# 16     3   90 49.49
# 17     3  180 47.07
# ...

However, there are many ways to do this in R: the most basic reference on SO (of which this is probably a duplicate) is Reshaping data.frame from wide to long format, but there are many other relevant threads (see this search: [r] wide to long). Beyond using reshape, @lmo's method can be used, as well as methods based on the reshape2, tidyr, and data.table packages (presumably among others).

Upvotes: 3

lmo
lmo

Reputation: 38500

Here is one method using cbind and stack:

 longdf <- cbind(df$TIME, stack(df[,-1], ))
 names(longdf) <- c("TIME", "Study", "id")

This returns

longdf
   TIME Study     id
1     0 52.12 Study1
2    90 49.49 Study1
3   180 47.00 Study1
4   270 44.63 Study1
5   360 42.38 Study1
6   450 40.24 Study1
7   540 38.21 Study1
8     0 53.66 Study2
9    90 51.71 Study2
...

If you want to change id to integers as in your example, use

longdf$id <- as.integer(longdf$id)

Upvotes: 2

Related Questions