Ranking based on two variables

Question

I need to rank rows based on two variables and I just can't wrap my head around it.

Test data below:

df <- data.frame(A = c(12,35,55,7,6,NA,NA,NA,NA,NA), B = c(NA,12,25,53,12,2,66,45,69,43))

A  B
12 NA
35 12
55 25
7  53
6  12
NA 2
NA 66
NA 45
NA 69
NA 43

I want to calculate a third variable, C, that equals A when A is not NA. When A is NA, C should take its value from B, but a row where A is NA must never outrank a row where A is not NA.

In the data above, max(A) should equal max(C), and max(B) can only hold the sixth-highest C value, because A has five non-NA values. If A is NA and B would outrank a row where A is not NA, then some form of transformation should ensure that the row with A present always outranks the row that falls back on B in the final C score.
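
To make the requirement concrete, here is a minimal sketch of the ordering described above. It only computes ranks, not a C score, and the column name rank_C is a hypothetical one chosen for illustration. It assumes that ties can be broken by row order.

```r
# Test data from the question
df <- data.frame(A = c(12, 35, 55, 7, 6, NA, NA, NA, NA, NA),
                 B = c(NA, 12, 25, 53, 12, 2, 66, 45, 69, 43))

# Value that decides the order within each group: A when present, else B
key <- ifelse(is.na(df$A), df$B, df$A)

# Rows with A present sort first (is.na(A) is FALSE), then larger keys first
ord <- order(is.na(df$A), -key)

# rank_C is a hypothetical column: 1 = highest-ranked row
df$rank_C <- integer(nrow(df))
df$rank_C[ord] <- seq_len(nrow(df))
```

With the test data, the row with A == 55 gets rank 1, the row with B == 69 gets rank 6, and the row with B == 2 gets rank 10.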

I would like the result to look something like this:

So far the closest I can get is

df$C <- ifelse(is.na(df$A), min(df$A, na.rm = TRUE) / df$B, df$A)

But that turns the ranking upside down when A is NA, because dividing by B gives the largest score to the smallest B, so B == 2 is ranked 6 instead of B == 69.
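
One possible way around the inversion, sketched here under assumptions rather than offered as a definitive fix, is to shift B instead of dividing by it, so that the largest B among the rows with A missing lands just below min(A). The names missing_A, gap, shift and attempt are hypothetical, and the gap of 1 is an arbitrary choice. Note that shifted scores can become negative.

```r
# Test data from the question
df <- data.frame(A = c(12, 35, 55, 7, 6, NA, NA, NA, NA, NA),
                 B = c(NA, 12, 25, 53, 12, 2, 66, 45, 69, 43))

missing_A <- is.na(df$A)

# The attempt from the question: dividing by B reverses the order of B
attempt <- ifelse(missing_A, min(df$A, na.rm = TRUE) / df$B, df$A)

# Alternative sketch: add a constant to B so that its largest value
# (among rows with A missing) sits `gap` below the smallest A.
# Adding a constant preserves the order of B; any positive gap works.
gap <- 1
shift <- min(df$A, na.rm = TRUE) - gap - max(df$B[missing_A], na.rm = TRUE)
df$C <- ifelse(missing_A, df$B + shift, df$A)

# Rank 1 = highest C
rank(-df$C)
```

Because the same constant is added to every B, the differences between the B values are kept, and every row with A present stays above every row that falls back on B.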

I'm not sure if I could use some kind of weights?

Any suggestions are greatly appreciated! Thanks!
