Reputation: 13
I have a data frame with 184 observations of 5 variables:
'data.frame': 184 obs. of 5 variables:
$ Cat : Factor w/ 10 levels "99-001","99-002",..: 1 1 1 1 1 1 1 1 1 1 ...
$ No : int 1 1 1 1 1 1 1 1 1 1 ...
$ ehs : int 0 0 0 0 0 0 0 0 0 0 ...
$ Onset : int 0 0 0 0 0 0 0 9 9 9 ...
$ STARTING: Factor w/ 149 levels "1:37PM1","1:42PM1",..: 3 4 5 63 64 65 66 67...
The data frame comes from a repeated-measures study, which means each case was measured several times.
Now I want to create a new variable (provoke) based on the onset of each case. If the first Onset value of a case is 0, then the new variable (provoke) will be coded as 0 for that case; otherwise it will be coded as 1.
My R script:
no1 <- seq[seq$No == 1, ]
if (no1[1,4]==0) {no1$provoke =0} else {no1$provoke =1}
no2 <- seq[seq$No == 2, ]
if (no2[1,4]==0) {no2$provoke = 0} else {no2$provoke = 1}
Because there are many cases, I intend to write a loop to finish the task:
for (i in 1:10) {
noi <- seq[seq$No == i, ]
if (noi[1,4]==0) {
noi$provoke = 0}
else {noi$provoke = 1}
}
But the loop does not seem to work. Could you please help me find the bug or point out my mistake?
Upvotes: 0
Views: 221
Reputation: 70623
seq is a really bad name to choose for a data.frame, because it is also the name of a base R function. Let's call it xy for this one example.
xy <- data.frame(case = rep(1:5, each = 10), oldvar = rbinom(50, size = 1, prob = 0.5))
xy.split <- split(xy, f = xy$case)
manipulateXY <- function(x) {
  if (x[1, "oldvar"] == 0) {
    x$newvar <- 0
  } else {
    x$newvar <- 1
  }
  x
}
xy.newvar <- lapply(xy.split, FUN = manipulateXY)
xy.new <- do.call("rbind", xy.newvar)
xy.new
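The split, lapply and rbind steps above could probably also be collapsed into one call to ave(), which applies a function within each group and returns a vector as long as the input. This is only a sketch on the same made-up xy data; the name first.val is a hypothetical helper introduced here, and the sketch assumes the rows within each case are already in measurement order.

```r
# Made-up example data with the same shape as xy above
xy <- data.frame(case = rep(1:5, each = 10),
                 oldvar = rbinom(50, size = 1, prob = 0.5))

# ave() runs FUN on the oldvar values of each case and recycles the result,
# so the first oldvar of a case is repeated across all rows of that case
first.val <- ave(xy$oldvar, xy$case, FUN = function(v) v[1])

# 0 if the first value of the case is 0, otherwise 1
xy$newvar <- ifelse(first.val == 0, 0, 1)
head(xy)
```

Unlike the rep() approach, this does not require the data frame to be sorted by case, since ave() puts each result back in the position of the row it came from.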
Another way of going about this would be the following. This one assumes the data is ordered by case.
# find first occurrence
zero.or.not <- do.call("rbind", lapply(xy.split, FUN = function(x) x[1, ]))$oldvar
# count number of rows
num.rows <- unlist(lapply(xy.split, FUN = nrow))
xy.new$newvar2 <- rep(zero.or.not, times = num.rows)
xy.new
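As for the loop in the question: it assigns the subset to a single object called noi on every pass (the i in the name is not replaced by the loop counter), modifies that copy, and never writes the result back to the original data frame, so nothing appears to happen. A loop can still do the job if it writes into the data frame directly. The following is a minimal sketch on the made-up xy data, not on the original data; the names rows and first are hypothetical helpers.

```r
# Made-up example data with the same shape as xy above
xy <- data.frame(case = rep(1:5, each = 10),
                 oldvar = rbinom(50, size = 1, prob = 0.5))

# Create the new column once, before the loop
xy$provoke <- NA_real_

for (i in unique(xy$case)) {
  rows  <- xy$case == i          # logical index of the rows of this case
  first <- xy$oldvar[rows][1]    # first measurement of this case
  # Write the result back into xy itself, not into a temporary copy
  xy$provoke[rows] <- if (first == 0) 0 else 1
}
xy
```

Looping over unique(xy$case) rather than 1:10 avoids hard-coding the number of cases.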
Upvotes: 1