Conditionally modifying dataframe column value

Question

I have a dataframe that shows the cut scores that relate to different performance levels (1 through 5) on state tests. The DF looks like this:

grade <- rep(1:2, each = 5)
performance_level <- rep(1:5, 2)
score_start <- c(100, 134, 157, 170, 192, 100, 129, 142, 158, 180)
score_end <- c(134, 156, 169, 192, 220, 128, 142, 157, 179, 200)

df <- data.frame(grade, performance_level, score_start, score_end)

The problem is that the score_end in some rows equals the score_start in the next row (e.g. rows 1 and 2), so a first-grade student who scores 134 is duplicated and shows up as earning both performance level 1 and performance level 2. I would like to add 1 to score_start in row 2 so that it becomes 135. This problem occurs in multiple rows (I have a large dataset). I've tried using dplyr's lead and lag, but I can't quite get them to behave the way I want. Here is the code I have tried so far:

try #1

df$score_start[which(df$score_start == lag(df$score_end))] <- df$score_start + 1

try #2

df <- df %>% mutate(score_start = ifelse(score_end == lead(score_start), score_start + 1, score_start))

Any help would be much appreciated.

BetterCallMe · Accepted Answer

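Loop over adjacent rows and, whenever a row's score_end equals the next row's score_start, add 1 to that next row's score_start: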

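# Walk adjacent row pairs; the last row has no successor, so stop at nrow(df) - 1.
for (i in seq_len(nrow(df) - 1)) {
  # If this row's upper bound equals the next row's lower bound, bump the next row's start by 1.
  if (df$score_end[i] == df$score_start[i + 1]) {
    df$score_start[i + 1] <- df$score_start[i + 1] + 1
  }
}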
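
For a large dataset, the same comparison can probably be done without a loop. The row that changes is the later one, so each row has to be compared with the previous row's score_end (the direction lag looks in), not with the next row's score_start. Below is a minimal base R sketch of that idea. The helper names prev_end, prev_grade and overlap are made up for illustration, and the extra check that both rows belong to the same grade is an assumption that is not part of the loop above.

```r
# Sample data from the question.
grade <- rep(1:2, each = 5)
performance_level <- rep(1:5, 2)
score_start <- c(100, 134, 157, 170, 192, 100, 129, 142, 158, 180)
score_end <- c(134, 156, 169, 192, 220, 128, 142, 157, 179, 200)
df <- data.frame(grade, performance_level, score_start, score_end)

# Shift score_end and grade down by one row so that each row can see
# the values of the row before it (the first row has no predecessor).
prev_end <- c(NA, head(df$score_end, -1))
prev_grade <- c(NA, head(df$grade, -1))

# A row overlaps when its score_start equals the previous row's score_end.
# Assumption: only rows within the same grade should be compared.
overlap <- !is.na(prev_end) &
  df$grade == prev_grade &
  df$score_start == prev_end

# Add 1 only to the overlapping rows.
df$score_start[overlap] <- df$score_start[overlap] + 1

print(df)
```

With the sample data this changes score_start in rows 2, 5 and 8 (to 135, 193 and 143), which matches what the loop produces.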
