Reputation: 121
I have observed a number of subjects for 2-5 years, and each year asked whether they had a specific symptom ("yes" or "no"). I want to count how many times this symptom state changed within each subject, i.e. the number of shifts (from "no" to "yes" or from "yes" to "no") during the observation period (year 1 to year 5). Unfortunately, I have some NAs where the subject did not answer. These NAs should be ignored.
subject <- c("a", "b", "c", "d")
year1 <- c("no", "yes", NA, NA)
year2 <- c("yes", "yes", NA, "yes")
year3 <- c("no", "yes", "yes", NA)
year4 <- c("yes", "yes", NA, "no")
year5 <- c("yes", "yes", "yes", NA)
df = data.frame(subject, year1, year2, year3, year4, year5)
df
How do I create the new numeric variable "df$shifts" (number of shifts)? In this example, "df$shifts" should be 3, 0, 0, 1.
Upvotes: 1
Views: 109
Reputation: 887128
We can loop over the rows, take the run-length encoding (rle) of the non-NA elements, extract the 'values', sum the adjacent elements that are not equal, and assign the result to a new column 'shifts'.
df$shifts <- apply(df[-1], 1, function(x) {
  x1 <- rle(x[!is.na(x)])$values  # collapse runs of equal answers, ignoring NAs
  sum(x1[-1] != x1[-length(x1)])  # count changes between adjacent runs
})
#[1] 3 0 0 1
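For comparison, here is a minimal sketch of the same idea without rle: drop the NAs, then compare each remaining answer with the one before it. The helper name count_shifts is hypothetical (it is not part of the answer above), and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Hypothetical helper (not from the original answer): count changes between
# consecutive non-missing answers in one subject's row.
count_shifts <- function(x) {
  x <- x[!is.na(x)]              # drop the years the subject did not answer
  if (length(x) < 2) return(0L)  # fewer than two answers: no shift possible
  sum(x[-1] != x[-length(x)])    # compare each answer with the previous one
}

# Example data copied from the question.
df <- data.frame(
  subject = c("a", "b", "c", "d"),
  year1 = c("no", "yes", NA, NA),
  year2 = c("yes", "yes", NA, "yes"),
  year3 = c("no", "yes", "yes", NA),
  year4 = c("yes", "yes", NA, "no"),
  year5 = c("yes", "yes", "yes", NA)
)

# Apply the helper to every row, skipping the subject column.
df$shifts <- apply(df[-1], 1, count_shifts)
df$shifts
```

Because the NAs are removed before comparing, a gap between two answers is bridged: subject "d" (yes, NA, no) counts as one shift, as the question asks.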
Upvotes: 1