Reputation: 501
I have a data.frame with multiple entries per unique id. I need to determine which rows exceed a predefined time limit of 60 seconds. I have already attached a column that is populated with the term "toolong" to indicate the row I need to split the time column. I then want to create a new row directly below the row that has "toolong" and preserve all the same information as the "parent row" except to change the action column to "l" and the time column to the previous time - 60. The parent row will contain all the same information except the action column will be changed to "for" and the time to 60 secs. There are a total of 32 columns in the original database so preserving all the contents of the row other than action and time is necessary.
Example:
id <- c(1,1,1,1,2,2,2,2)
resting <- c("f","f","toolong","f","f","f","toolong","f")
action <- c("h","h","l","d","h","h","l","d")
time <- c(90,12,120,14,90,12,110,14)
other <- c(1,2,3,4,5,6,5,4)
dat <- data.frame(cbind(id,resting,action,time,other))
How I would like it to look:
id2 resting2 action2 time2 other2
1 1 f h 90 1
2 1 f h 12 2
3 1 toolong for 60 3
4 1 toolong l 60 3
5 1 f d 14 4
6 2 f h 90 5
7 1 f h 12 6
8 2 toolong for 60 5
9 2 toolong l 50 5
10 2 f d 14 4
Thanks, Tim
Upvotes: 0
Views: 322
Reputation: 263342
dat2 <- rbind(dat, dat[ dat$resting=="toolong" , ])
dat2 <- dat2[order(rownames(dat2)), ]
dat2[duplicated(dat2), "action"] <- "l"
names(dat2) <- paste0(names(dat2), "2")
dat2
#-------
id2 resting2 action2 time2 other2
1 1 f h 90 1
2 1 f h 12 2
3 1 toolong l 120 3
31 1 toolong l 120 3
4 1 f d 14 4
5 2 f h 90 5
6 2 f h 12 6
7 2 toolong l 110 5
71 2 toolong l 110 5
8 2 f d 14 4
The other way for constructing duplicate rownames to be used as a selection vector was with mapply
and adding 1 to a logical vector. That may have some advantage as the periods in the rownames are a better "dupe" indicator.
dat[ unlist(mapply( rep, rownames(dat), 1+(dat$resting=="toolong"))) , ]
id resting action time other
1 1 f h 90 1
2 1 f h 12 2
3 1 toolong l 120 3
3.1 1 toolong l 120 3
4 1 f d 14 4
5 2 f h 90 5
6 2 f h 12 6
7 2 toolong l 110 5
7.1 2 toolong l 110 5
8 2 f d 14 4
Addressing the comment::modifications of the question:
dat2$action2 <- as.character(dat2$action2)
dat2[ dat2$resting2=="toolong" & !duplicated(dat2) , "action2"] <- "for"
dat2
id2 resting2 action2 time2 other2
1 1 f h 90 1
2 1 f h 12 2
3 1 toolong for 120 3
31 1 toolong l 120 3
4 1 f d 14 4
5 2 f h 90 5
6 2 f h 12 6
7 2 toolong for 110 5
71 2 toolong l 110 5
8 2 f d 14 4
Upvotes: 1
Reputation: 9687
First, repeat the rows that are toolong...
R>rowID <- rep(1:8, times=as.factor(resting))
R>dat2 <- dat[rowID,]
R>dat2
id resting action time other
1 1 f h 90 1
2 1 f h 12 2
3 1 toolong l 120 3
3.1 1 toolong l 120 3
4 1 f d 14 4
5 2 f h 90 5
6 2 f h 12 6
7 2 toolong l 110 5
7.1 2 toolong l 110 5
8 2 f d 14 4
Then, for the duplicated ones, subtract of 60min per previous record...
R>dups <- unlist(tapply(duplicated(rowID), rowID,cumsum))
R>dat2$time <- dat2$time - 60*dups
R>dat2[dat2$resting == "toolong", "time"] <- pmin(60, dat2[dat2$resting == "toolong", "time"] )
R>dat2
id resting action time other
1 1 f h 90 1
2 1 f h 12 2
3 1 toolong l 60 3
3.1 1 toolong l 60 3
4 1 f d 14 4
5 2 f h 90 5
6 2 f h 12 6
7 2 toolong l 60 5
7.1 2 toolong l 50 5
8 2 f d 14 4
Upvotes: 2