Reputation: 11
This is the code I am trying to run, and it is taking a long time.
districts is a data frame with 39299 rows and 16 columns, and lm_data is a data frame with 59804 rows and 16 columns. I want to create a new variable in lm_data called tentativeStartDate that takes the value of districts$firstDay[j] when two conditions are met: row i of lm_data and row j of districts have the same DISTORGID and the same gradeCode. Is there a more efficient way to do this?
for (i in 1:nrow(lm_data)) {
  for (j in 1:nrow(districts)) {
    # both keys must match for row i of lm_data and row j of districts
    if (lm_data$DISTORGID[i] == districts$DISTORGID[j] && lm_data$gradeCode[i] == districts$gradeCode[j]) {
      lm_data$tentativeStartDate[i] = districts$firstDay[j]
    }
  }
}
Upvotes: 1
Views: 481
Reputation: 288
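Not sure if this will work since I can't test it, but if it does work it should be much faster. Note that == compares the two vectors element by element, so this only gives the intended result if both data frames have the same number of rows in the same order; with 59804 and 39299 rows, R will recycle the shorter vector and warn.
# get the indices of rows where both keys match position by position
idx <- which(lm_data$DISTORGID == districts$DISTORGID & lm_data$gradeCode == districts$gradeCode)
lm_data$tentativeStartDate[idx] <- districts$firstDay[idx]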
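If the two data frames are not row-aligned, a keyed lookup with base R's match() may be a better fit. The following is a minimal sketch, not tested on the real data: the toy data frames and their values are invented for illustration, and it assumes that "|" never appears in DISTORGID or gradeCode so the pasted keys stay unambiguous. One difference from the nested loop: if districts contains the same DISTORGID/gradeCode pair more than once, match() takes the first matching row, whereas the loop keeps the last.

```r
# Toy stand-ins for the real data frames (values are made up for illustration).
districts <- data.frame(
  DISTORGID = c(1, 1, 2),
  gradeCode = c("K", "1", "K"),
  firstDay  = as.Date(c("2015-08-24", "2015-08-17", "2015-09-01"))
)
lm_data <- data.frame(
  DISTORGID = c(2, 1, 1, 3),
  gradeCode = c("K", "1", "K", "K")
)

# Build one composite key per row (assumes "|" does not occur in either column).
key_lm <- paste(lm_data$DISTORGID, lm_data$gradeCode, sep = "|")
key_d  <- paste(districts$DISTORGID, districts$gradeCode, sep = "|")

# For each lm_data row, the position of the first districts row with the same key
# (NA when there is no such row).
pos <- match(key_lm, key_d)

# Indexing with NA yields NA, so unmatched rows get a missing start date.
lm_data$tentativeStartDate <- districts$firstDay[pos]
```

This does a single hashed lookup per row instead of comparing every pair of rows, so it should scale far better than the double loop. merge(lm_data, districts[, c("DISTORGID", "gradeCode", "firstDay")], all.x = TRUE) is another option, though it may reorder the rows and will duplicate lm_data rows when a key pair appears more than once in districts.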
Upvotes: 1