Datamaniac

Reputation: 181

Row wise comparison of a dataframe in R

I have a data frame with multiple data points for each ID. When the status value differs between two consecutive time points for an ID, I want to flag only the first status change. How do I achieve that in R? Below is a sample dataset.

ID Time Status
ID1 0 X
ID1 6 X
ID1 12 Y
ID1 18 Z

Result dataset

ID Time Status Flag
ID1 0 X
ID1 6 X
ID1 12 Y 1
ID1 18 Z

Upvotes: 0

Views: 475

Answers (2)

GuedesBF

Reputation: 9858

You can use mutate() with ifelse() and lag() to flag every status change within each ID, then use replace() to reset every flagged row after the first one to 0:

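library(dplyr)

df1 %>%
  group_by(ID) %>%
  # Flag = 1 wherever Status differs from the previous row of the same ID
  mutate(Flag = ifelse(is.na(lag(Status)), 0,
                       as.integer(Time != lag(Time) & Status != lag(Status)))) %>%
  # Within each ID, keep only the first 1 and reset the later ones to 0
  group_by(ID, Flag) %>%
  mutate(Flag = replace(Flag, Flag == lag(Flag) & Flag == 1, 0))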

# A tibble: 4 x 4
# Groups:   ID, Flag [2]
  ID     Time Status  Flag
  <fct> <int> <fct>  <dbl>
1 ID1       0 X          0
2 ID1       6 X          0
3 ID1      12 Y          1
4 ID1      18 Z          0
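
A possible variant of the same idea, as a minimal sketch: mark every change with lag(), then keep only the row where the running count of changes first reaches 1. The data frame df2, the second ID and the helper column changed are made up for illustration and are not part of the question.

```r
library(dplyr)

# Hypothetical data: the question's ID1 plus an invented ID2
df2 <- data.frame(
  ID     = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2"),
  Time   = c(0, 6, 12, 18, 0, 6, 12),
  Status = c("X", "X", "Y", "Z", "A", "B", "B"),
  stringsAsFactors = FALSE
)

res <- df2 %>%
  group_by(ID) %>%
  mutate(
    # TRUE where Status differs from the previous row of the same ID;
    # the first row of each ID is compared with itself, so it is FALSE
    changed = Status != lag(Status, default = first(Status)),
    # keep only the first change: the running count of changes equals 1 there
    Flag = as.integer(changed & cumsum(changed) == 1)
  ) %>%
  ungroup() %>%
  select(-changed)

res
```

This needs only one group_by() and does not regroup on the Flag column itself.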

Upvotes: 1

Rui Barradas

Reputation: 76402

Here is a base R solution with ave(). It creates a vector y that is 1 wherever the status differs from the previous row within the same ID, and 0 otherwise. Flag is then set to 1 wherever y changes from one row to the next, using diff().

y <- with(df1, ave(Status, ID, FUN = function(x) c(0, x[-1] != x[-length(x)])))
df1$Flag <- c(0, diff(as.integer(y)) != 0)

df1
#   ID Time Status Flag
#1 ID1    0      X    0
#2 ID1    6      X    0
#3 ID1   12      Y    1
#4 ID1   18      Z    0

Data

df1 <- read.table(text = "
ID  Time    Status
ID1     0   X
ID1     6   X
ID1     12  Y
ID1     18  Z                  
", header = TRUE)
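
Note that the diff() step above runs over the whole column rather than per ID, so with several IDs, or with a status that stays the same after the first change, it can flag more rows than the first change. A minimal sketch of a per-ID variant follows; the helper first_change(), the data frame df2 and the second ID are hypothetical and only there for illustration.

```r
# Hypothetical helper: flag only the first status change in one ID's rows
first_change <- function(status) {
  # TRUE where a value differs from the one before it (first row is FALSE)
  changed <- c(FALSE, status[-1] != status[-length(status)])
  # the first change is where the running count of changes equals 1
  as.integer(changed & cumsum(changed) == 1)
}

# Hypothetical data: the question's ID1 plus an invented ID2
df2 <- data.frame(
  ID     = c("ID1", "ID1", "ID1", "ID1", "ID2", "ID2", "ID2", "ID2"),
  Time   = c(0, 6, 12, 18, 0, 6, 12, 18),
  Status = c("X", "X", "Y", "Z", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# ave() over row indices, so the result is integer whatever type Status has
df2$Flag <- ave(seq_len(nrow(df2)), df2$ID,
                FUN = function(i) first_change(as.character(df2$Status[i])))

df2
```

Passing row indices to ave() instead of Status keeps the 0/1 result from being coerced to the type of Status.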

Upvotes: 2
