Reputation: 501

add new row in R based on a factor variable

I have a data.frame with multiple entries per unique id. I need to determine which rows exceed a predefined time limit of 60 seconds. I have already attached a column that is populated with the term "toolong" to indicate the row I need to split the time column. I then want to create a new row directly below the row that has "toolong" and preserve all the same information as the "parent row" except to change the action column to "l" and the time column to the previous time - 60. The parent row will contain all the same information except the action column will be changed to "for" and the time to 60 secs. There are a total of 32 columns in the original database so preserving all the contents of the row other than action and time is necessary.

Example:

id <- c(1,1,1,1,2,2,2,2)
resting <- c("f","f","toolong","f","f","f","toolong","f")
action <- c("h","h","l","d","h","h","l","d")
time <- c(90,12,120,14,90,12,110,14)
other <- c(1,2,3,4,5,6,5,4)
dat <- data.frame(cbind(id,resting,action,time,other))

How I would like it to look:

   id2 resting2 action2 time2 other2
1    1        f       h    90      1
2    1        f       h    12      2
3    1  toolong     for    60      3
4    1  toolong       l    60      3
5    1        f       d    14      4
6    2        f       h    90      5
7    1        f       h    12      6
8    2  toolong     for    60      5
9    2  toolong       l    50      5
10   2        f       d    14      4

Thanks, Tim

Upvotes: 0

Answers (2)

IRTFM

Reputation: 263342

dat2 <- rbind(dat, dat[ dat$resting=="toolong" , ])
dat2 <- dat2[order(rownames(dat2)), ]
dat2[duplicated(dat2), "action"] <- "l"
names(dat2) <- paste0(names(dat2), "2")
dat2
#-------
   id2 resting2 action2 time2 other2
1    1        f       h    90      1
2    1        f       h    12      2
3    1  toolong       l   120      3
31   1  toolong       l   120      3
4    1        f       d    14      4
5    2        f       h    90      5
6    2        f       h    12      6
7    2  toolong       l   110      5
71   2  toolong       l   110      5
8    2        f       d    14      4

The other way for constructing duplicate rownames to be used as a selection vector was with mapply and adding 1 to a logical vector. That may have some advantage as the periods in the rownames are a better "dupe" indicator.

 dat[ unlist(mapply( rep, rownames(dat), 1+(dat$resting=="toolong"))) , ]

    id resting action time other
1    1       f      h   90     1
2    1       f      h   12     2
3    1 toolong      l  120     3
3.1  1 toolong      l  120     3
4    1       f      d   14     4
5    2       f      h   90     5
6    2       f      h   12     6
7    2 toolong      l  110     5
7.1  2 toolong      l  110     5
8    2       f      d   14     4

Addressing the comment::modifications of the question:

dat2$action2 <- as.character(dat2$action2)
dat2[ dat2$resting2=="toolong" & !duplicated(dat2) , "action2"] <- "for"
dat2
   id2 resting2 action2 time2 other2
1    1        f       h    90      1
2    1        f       h    12      2
3    1  toolong     for   120      3
31   1  toolong       l   120      3
4    1        f       d    14      4
5    2        f       h    90      5
6    2        f       h    12      6
7    2  toolong     for   110      5
71   2  toolong       l   110      5
8    2        f       d    14      4

Upvotes: 1

Neal Fultz

Reputation: 9687

First, repeat the rows that are toolong...

R>rowID <- rep(1:8, times=as.factor(resting))        
R>dat2 <- dat[rowID,]
R>dat2
    id resting action time other
1    1       f      h   90     1
2    1       f      h   12     2
3    1 toolong      l  120     3
3.1  1 toolong      l  120     3
4    1       f      d   14     4
5    2       f      h   90     5
6    2       f      h   12     6
7    2 toolong      l  110     5
7.1  2 toolong      l  110     5
8    2       f      d   14     4

Then, for the duplicated ones, subtract of 60min per previous record...

R>dups <- unlist(tapply(duplicated(rowID), rowID,cumsum))
R>dat2$time <- dat2$time - 60*dups
R>dat2[dat2$resting == "toolong", "time"] <- pmin(60, dat2[dat2$resting == "toolong",     "time"] )
R>dat2
    id resting action time other
1    1       f      h   90     1
2    1       f      h   12     2
3    1 toolong      l   60     3
3.1  1 toolong      l   60     3
4    1       f      d   14     4
5    2       f      h   90     5
6    2       f      h   12     6
7    2 toolong      l   60     5
7.1  2 toolong      l   50     5
8    2       f      d   14     4

Upvotes: 2

add new row in R based on a factor variable

Answers (2)

Related Questions