VincePi

Reputation: 45

R: How to generate an incremental sequence based on a time-segmented sequence

I'm trying to convert a data frame of the form 'initial.df' into 'final.df', and my skills in programming and R are being seriously tested. I have tried various approaches without success.

# minimal initial data structure
initial.df = cbind.data.frame(dtime = as.POSIXct(c("12:30", "12:31", "12:32", 
              "13:10", "13:11", "13:12", "20:14", "20:15", "20:16"), format="%H:%M"),
              flow=c(120, 100, 90, 110, 100, 95, 115, 100, 95))
initial.df

# minimal final data structure
final.df = cbind.data.frame(initial.df, cycle=c(rep(1, 3), rep(2,3), rep(3,3)))
final.df

As background, the data are logged every minute from a membrane bioreactor during filtration, and gaps in filtration separate one cycle from the next. Each cycle runs for several hours. Thank you in advance for your assistance. Vince

Updated data set to better reflect the actual type of data:

initial.df = cbind.data.frame(dtime = as.POSIXct(c("2015-12-18 23:58",
    "2015-12-18 23:59", "2015-12-19 00:01", "2015-12-19 00:02", "2015-12-19 04:58",
    "2015-12-19 04:59", "2015-12-19 05:00", "2015-12-19 05:01", "2015-12-19 05:02",
    "2015-12-19 07:59", "2015-12-19 08:00", "2015-12-19 08:01", "2015-12-19 08:02"),
    format="%Y-%m-%d %H:%M"),
    flow=c(120, 100, 90, 80, 75, 110, 100, 95, 85, 115, 100, 95, 90))
initial.df

# final data structure
final.df = cbind.data.frame(initial.df, cycle=c(rep(1, 4), rep(2,5), rep(3,4)))
final.df

Upvotes: 1

Views: 136

Answers (1)

akrun

Reputation: 887118

We could cut 'dtime' with breaks specified as '1 hour' to create an integer grouping variable (the hour bin each timestamp falls into), then take the difference between adjacent elements (diff) and check which differences are greater than 1, i.e. where at least one whole hour bin was skipped. Because the diff output is one element shorter than the 'dtime' column, we prepend a TRUE value and take the cumulative sum, so the counter starts at 1 and increases by 1 at each such jump:

initial.df$cycle <- cumsum(c(TRUE,diff(cut(initial.df$dtime, 
                            breaks='1 hour', labels=FALSE))>1))
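
As a possible variant (not part of the code above): the hour-bin approach only starts a new cycle when a whole hour bin is skipped, so a gap such as 12:32 to 13:10 in the first example would probably not be split. A minimal sketch that instead thresholds the gap between consecutive timestamps directly is shown below. The 30-minute threshold (`gap_minutes`) and the explicit `tz = "UTC"` are assumptions for illustration and should be adjusted to the real logging setup.

```r
# Updated example data from the question (tz = "UTC" is an assumption,
# used here only to keep the result independent of local DST rules)
initial.df <- data.frame(
  dtime = as.POSIXct(c("2015-12-18 23:58", "2015-12-18 23:59",
                       "2015-12-19 00:01", "2015-12-19 00:02",
                       "2015-12-19 04:58", "2015-12-19 04:59",
                       "2015-12-19 05:00", "2015-12-19 05:01",
                       "2015-12-19 05:02", "2015-12-19 07:59",
                       "2015-12-19 08:00", "2015-12-19 08:01",
                       "2015-12-19 08:02"),
                     format = "%Y-%m-%d %H:%M", tz = "UTC"),
  flow = c(120, 100, 90, 80, 75, 110, 100, 95, 85, 115, 100, 95, 90)
)

# Hypothetical threshold: a gap longer than this starts a new cycle
gap_minutes <- 30

# POSIXct is stored as seconds, so dividing by 60 gives gaps in minutes
gaps <- diff(as.numeric(initial.df$dtime)) / 60

# Prepend TRUE so the first row starts cycle 1, then count the long gaps
initial.df$cycle <- cumsum(c(TRUE, gaps > gap_minutes))
initial.df
```

This assumes the rows are already sorted by 'dtime'; if they are not, order the data frame by 'dtime' first.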

Upvotes: 2
