regents

Reputation: 626

Specifying a new column based on criteria in another column

I'm working with registration data for a school going back to 1890 and currently have columns for the month (as a number) and the year. I would like to group these values into school years so that August through April all fall in the same school year. For example, 8/2010 through 4/2011 would belong to the 2010 school year. In SAS I would have used the code below, but I can't get my R code to work and I'm not sure what I'm missing. I apologize for my R code; I'm still learning. SAS code:

If Month="8" or Month="9" or Month= "10" or Month= "11" or Month="12" then SchoolYear=Year;
If Month= "1" or Month="2" or Month="3" or Month="4" then SchoolYear= Year-1;
If Month="5"  or Month="6" or Month="7"  then SchoolYear= "";

R Code and corresponding error:

for (i in nrow(df)) if(df$Month == 8 | df$Month == 9 |df$Month ==10| df$Month ==11 | df$Month == 12) {df$SchoolYear == df$Year} else if (df$Month == 1 | df$Month == 2 | df$Month == 3 | df$Month == 4) {df$SchoolYear == df$Year- 1} else {df$SchoolYear == "NA"}

the condition has length > 1 and only the first element will be used
the condition has length > 1 and only the first element will be used

Upvotes: 3

Views: 41

Answers (1)

akrun

Reputation: 886938

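We can use %in% to compare against multiple values at once. Unlike if, which expects a single TRUE/FALSE, case_when is vectorised over the whole column, so no loop is needed: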

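library(dplyr)
# Conditions are checked in order for every row; the first match wins
df %>% 
  mutate(SchoolYear = case_when(Month %in% 8:12 ~ Year,        # Aug-Dec: same year
                        Month %in% 1:4 ~  Year - 1L,           # Jan-Apr: previous year
                        Month %in% 5:7 ~ NA_integer_))         # May-Jul: no school year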
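
For reference, the loop in the question fails for three separate reasons: `for (i in nrow(df))` visits only the last row index, `if` expects a single TRUE/FALSE rather than a whole column, and `==` compares instead of assigning. A minimal base R sketch of the same logic with nested `ifelse()`, which is vectorised, might look like the following. The small `toy` data frame is made up for illustration and is not the asker's data.

```r
# Hypothetical toy data, not the asker's real data
toy <- data.frame(Month = c(8L, 12L, 1L, 4L, 6L),
                  Year  = c(2010L, 2010L, 2011L, 2011L, 2011L))

# ifelse() evaluates the test for every row at once
toy$SchoolYear <- with(toy,
  ifelse(Month %in% 8:12, Year,            # Aug-Dec: same calendar year
  ifelse(Month %in% 1:4,  Year - 1L,       # Jan-Apr: previous calendar year
                          NA_integer_)))   # May-Jul: no school year

toy$SchoolYear
```

Nested `ifelse()` gets hard to read beyond two or three branches, which is one reason `case_when` may be the more comfortable choice here.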

Based on the logic, it can be further simplified to

df$SchoolYear <- with(df,  (NA^(Month %in% 5:7)* Year) - (Month %in% 1:4))
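
The one-liner relies on two facts in R: `NA^0` is 1 while `NA^1` is `NA`, and logical values are coerced to 0/1 in arithmetic. A small sketch that unpacks it step by step, using made-up values (the names `summer`, `spring` and `mask` are only for illustration):

```r
# Made-up example values for illustration
Month <- c(9L, 2L, 6L)
Year  <- c(2010L, 2011L, 2011L)

summer <- Month %in% 5:7   # FALSE FALSE  TRUE
spring <- Month %in% 1:4   # FALSE  TRUE FALSE

mask <- NA^summer          # 1 where FALSE (NA^0), NA where TRUE (NA^1)

# Multiplying by the mask blanks out May-Jul;
# subtracting the logical takes 1 off Jan-Apr only
SchoolYear <- mask * Year - spring

SchoolYear
# 2010 2010   NA
```

Note that this version returns a double rather than an integer column, which usually does not matter but may be worth knowing when comparing it with the `case_when` result.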

data

set.seed(24)
df <- data.frame(Month = sample(1:12, 30, replace = TRUE),
     Year = sample(1978:2001, 30, replace = TRUE))

Upvotes: 1
