R - Split numeric vector into intervals

Question

I have a question about "splitting" a vector, though other approaches may work just as well. I have a data.frame (df) which looks like this (simplified version):
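
The sample data itself did not survive in this copy of the question. Judging from the expected output shown further down, it presumably had one row per case with columns "case" and "time". A minimal sketch of such an input follows; the values are inferred from that output, not quoted from the original post:

```r
# Assumed input, reconstructed from the expected output in the question:
# one row per case, with the time until the event occurs
df <- data.frame(case = 1:3, time = c(5, 3, 4))
df
```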

The "time" variable counts units of time (days, weeks, etc.) until an event occurs. I would like to expand the data set so that each case gets one row per interval, "splitting" "time" into intervals of length 1 that begin at 2. The result might then look something like this:

    case    time    begin   end
1   1       5       2       3
2   1       5       3       4
3   1       5       4       5
4   2       3       2       3
5   3       4       2       3
6   3       4       3       4

Obviously, my data set is a bit larger than this example. What would be a feasible method to achieve this result?

I had one idea of beginning with

df.exp <- df[rep(row.names(df), df$time - 2), 1:2]

in order to expand the number of rows per case, according to the number of time intervals. Based on this, a "begin" and "end" column might be added in the fashion of:

df.exp$begin <- 2:(df.exp$time-1)

However, this does not create the columns correctly: the ":" operator only uses the first element of (df.exp$time-1), with a warning, and does not restart the sequence for each "case".

Any ideas would be very much appreciated!

akrun · Accepted Answer

You can try the following, where df1 is the data.frame from the question:

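# Repeat each row once per interval (time - 2 intervals per case)
df2 <- df1[rep(seq_len(nrow(df1)), df1$time - 2), ]
row.names(df2) <- NULL
# For each case, build 2:time and pair consecutive values as (begin, end)
m1 <- do.call(rbind,
          Map(function(x, y) {
                  v1 <- seq(x, y)
                  cbind(v1[-length(v1)], v1[-1L])
              },
              2, df1$time))
df2[c('begin', 'end')] <- m1
df2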
#  case time begin end
#1    1    5     2   3
#2    1    5     3   4
#3    1    5     4   5
#4    2    3     2   3
#5    3    4     2   3
#6    3    4     3   4

Or an option with data.table

library(data.table)
# Per case: build 2:time, then pair consecutive values as (begin, end)
setDT(df1)[, {tmp <- seq(2, time)
              list(time = time,
                   begin = tmp[-length(tmp)],
                   end = tmp[-1])}, by = case]
#   case time begin end
#1:    1    5     2   3
#2:    1    5     3   4
#3:    1    5     4   5
#4:    2    3     2   3
#5:    3    4     2   3
#6:    3    4     3   4
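
Since every interval has length 1, the per-case counter can also be built without a loop. The following is a minimal base R sketch, not part of the original answer, that uses sequence() to restart the counter for each case. The input values are assumed from the expected output in the question, the name res is only illustrative, and every "time" is assumed to be at least 2:

```r
# Assumed input, reconstructed from the expected output in the question
df1 <- data.frame(case = 1:3, time = c(5, 3, 4))

# Number of unit intervals per case: [2,3], [3,4], ..., [time-1, time]
# (assumes time >= 2, otherwise rep() fails on a negative count)
n <- df1$time - 2

# Repeat each row once per interval
res <- df1[rep(seq_len(nrow(df1)), n), ]
row.names(res) <- NULL

# sequence(n) counts 1..n[i] for each case; adding 1 makes it start at 2
res$begin <- sequence(n) + 1
res$end <- res$begin + 1
res
```

This relies on the fixed interval length: "end" is always "begin" plus 1, so only the restarting counter has to be computed per case.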
