Reputation: 77
please help me... I have a data set containing information about academic degrees per year, like this:
Year1 Deg_Year1 Year2 Deg_Year2 Year3 Deg_Year3 Year4 Deg_Year4 Year5 Deg_Year5
2001 College 2004 Master NA NA NA NA NA NA
2004 College 2004 Master 2010 PHD NA NA NA NA
2006 Master 2006 College NA NA NA NA NA NA
2016 Master NA NA NA NA NA NA NA NA
2002 Master 2003 Master 2004 College 2004 Master NA NA
2002 Master 2002 College NA NA NA NA NA NA
I want to obtain a data frame that contains the year and the highest academic degree obtained before 2015, like this:
YearX Highest_Degree
2004 Master
2010 PHD
2006 Master
NA NA
2004 Master
2002 Master
Can someone please help me?
Thank you !
Upvotes: 1
Views: 40
Reputation: 887881
We can create a vector
of degrees in an order and then match
against the 'Deg_Year' columns, get the maximum value in each row with max.col
to subset the values and the corresponding 'Year' in each row
v1 <- c('Master', 'PHD')
nm1 <- grep('Deg', names(df1))
m1 <- sapply(df1[nm1], match, table = v1, nomatch = 0)
i1 <- max.col(m1) * NA^(!rowSums(m1!=0))
YearX <- df1[nm1-1][cbind(seq_len(nrow(df1)), i1)]
Highest_Degree <- df1[nm1][cbind(seq_len(nrow(df1)), i1)]
data.frame(YearX, Highest_Degree)
# YearX Highest_Degree
#1 2004 Master
#2 2010 PHD
#3 2006 Master
#4 NA <NA>
#5 2004 Master
#6 2002 Master
df1 <- structure(list(Year1 = c(2001L, 2004L, 2006L, 2016L, 2002L, 2002L
), Deg_Year1 = c("College", "College", "Master", "College", "Master",
"Master"), Year2 = c(2004L, 2004L, 2006L, NA, 2003L, 2002L),
Deg_Year2 = c("Master", "Master", "College", NA, "Master",
"College"), Year3 = c(NA, 2010L, NA, NA, 2004L, NA), Deg_Year3 = c(NA,
"PHD", NA, NA, "College", NA), Year4 = c(NA, NA, NA, NA,
2004L, NA), Deg_Year4 = c(NA, NA, NA, NA, "Master", NA),
Year5 = c(NA, NA, NA, NA, NA, NA), Deg_Year5 = c(NA, NA,
NA, NA, NA, NA)), .Names = c("Year1", "Deg_Year1", "Year2",
"Deg_Year2", "Year3", "Deg_Year3", "Year4", "Deg_Year4", "Year5",
"Deg_Year5"), class = "data.frame", row.names = c(NA, -6L))
Upvotes: 2