Reputation: 626
I'm working with registration data for a school going back to 1890, and I currently have columns for the month (as a number) and the year. I would like to group these values into school years so that August through April all fall in the same school year. For example, 8/2010 through 4/2011 would belong to the 2010 school year. In SAS I would have used the code below, but I can't get my R code to work and I'm not sure what I'm missing. I apologize for my R code; I'm still learning.
SAS Code:
If Month="8" or Month="9" or Month= "10" or Month= "11" or Month="12" then SchoolYear=Year;
If Month= "1" or Month="2" or Month="3" or Month="4" then SchoolYear= Year-1;
If Month="5" or Month="6" or Month="7" then SchoolYear= "";
R Code and corresponding error:
for (i in nrow(df)) if(df$Month == 8 | df$Month == 9 |df$Month ==10| df$Month ==11 | df$Month == 12) {df$SchoolYear == df$Year} else if (df$Month == 1 | df$Month == 2 | df$Month == 3 | df$Month == 4) {df$SchoolYear == df$Year- 1} else {df$SchoolYear == "NA"}
the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used
Upvotes: 3
Views: 41
Reputation: 886938
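The warning appears because if() expects a single TRUE/FALSE value, while df$Month == 8 returns one value per row. Also note that == compares rather than assigns. We can use %in% to compare against multiple values at once, together with the vectorized case_when: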
library(dplyr)
df %>%
  mutate(SchoolYear = case_when(Month %in% 8:12 ~ Year,
                                Month %in% 1:4 ~ Year - 1L,
                                Month %in% 5:7 ~ NA_integer_))
Based on the logic, it can be further simplified to
df$SchoolYear <- with(df, (NA^(Month %in% 5:7)* Year) - (Month %in% 1:4))
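The one-liner leans on two coercion facts in R: logical values act as 1/0 in arithmetic, and NA^0 is 1 while NA^1 is NA. A small check of that behaviour, using made-up month and year vectors (the names month, year and school_year are only for illustration):

```r
# NA raised to TRUE (1) stays NA; NA raised to FALSE (0) becomes 1
na_mask <- NA^c(TRUE, FALSE)

# Hypothetical values: September 2010, February 2011, June 2011
month <- c(9L, 2L, 6L)
year  <- c(2010L, 2011L, 2011L)

# Months 5:7 are multiplied by NA, months 1:4 have 1 subtracted
school_year <- (NA^(month %in% 5:7) * year) - (month %in% 1:4)
print(school_year)
```

Because NA^x produces a double, the result is numeric rather than integer; wrap it in as.integer() if an integer column is needed.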
Sample data:
set.seed(24)
df <- data.frame(Month = sample(1:12, 30, replace = TRUE),
                 Year = sample(1978:2001, 30, replace = TRUE))
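If you would rather avoid a package dependency, the same mapping could probably be written in base R with nested ifelse(), which is vectorized and so needs no loop. This is only a sketch; the toy data frame below is hypothetical and has one row per month so that every branch is exercised:

```r
# Hypothetical toy data: every month of 2010
toy <- data.frame(Month = 1:12, Year = rep(2010L, 12))

# Aug-Dec keep the year, Jan-Apr belong to the previous year,
# May-Jul fall outside the school year and become NA
toy$SchoolYear <- with(toy,
  ifelse(Month %in% 8:12, Year,
         ifelse(Month %in% 1:4, Year - 1L, NA_integer_)))

print(toy)
```

Using NA_integer_ rather than the string "NA" keeps the column numeric and marks the values as genuinely missing.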
Upvotes: 1