Bønding

Reputation: 77

Ranking based on two variables

I need to rank rows based on two variables and I just can't wrap my head around it.

Test data below:

df <- data.frame(A = c(12,35,55,7,6,NA,NA,NA,NA,NA), B = c(NA,12,25,53,12,2,66,45,69,43))

A  B
12 NA
35 12
55 25
7  53
6  12
NA 2
NA 66
NA 45
NA 69
NA 43

I want to calculate a third variable, C, that equals A when A is not NA. When A is NA, C should be derived from B, but the C score must guarantee that a row where A is NA never outranks a row where A is not NA.

In the data above, max(A) should equal max(C), and max(B) can hold only the sixth-highest C value, because A has five non-NA values. If A is NA and B would outrank a row where A is not NA, then some form of transformation should ensure that the non-NA A row always outranks the B row in the final C score.

I would like the result to look something like this:

A  B  C 
55 25 1
35 12 2  
12 NA 3 
7  53 4
6  12 5
NA 69 6
NA 66 7
NA 45 8
NA 43 9
NA 2  10

So far the closest I can get is

df$C <- ifelse(is.na(df$A), min(df$A, na.rm=T)/df$B, df$A)

But that turns the ranking upside down when A is NA, so B==2 is ranked 6 instead of B==69:

A  B  C 
55 25 1
35 12 2  
12 NA 3 
7  53 4
6  12 5
NA 2  6
NA 43 7
NA 45 8
NA 66 9
NA 69 10 

I'm not sure if I could use some kind of weights?

Any suggestions are greatly appreciated! Thanks!

Upvotes: 0

Views: 759

Answers (1)

count

Reputation: 1338

You can try:

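 # Rank the non-NA values of A, largest first (order() returns a permutation, not ranks)
 df$C <- rank(-df$A, na.last = "keep", ties.method = "first")
 # Rank the rows where A is NA by B, largest first, offset past all non-NA A rows
 df[is.na(df$A), "C"] <- rank(-df[is.na(df$A), "B"], ties.method = "first") + sum(!is.na(df$A))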

Then sort the rows by C:

df[order(df$C),]
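
An alternative that may be easier to read, as a minimal sketch assuming the df from the question: build the whole ordering with one multi-key order() call and then convert sort positions into ranks. The names a_key, b_key and ord are only illustrative.

```r
df <- data.frame(A = c(12, 35, 55, 7, 6, NA, NA, NA, NA, NA),
                 B = c(NA, 12, 25, 53, 12, 2, 66, 45, 69, 43))

# Sort key for A: descending A for non-NA rows, a constant for NA rows
a_key <- ifelse(is.na(df$A), 0, -df$A)
# Sort key for B: only matters for rows where A is NA (descending B)
b_key <- ifelse(is.na(df$A), -df$B, 0)

# First key puts every non-NA A row ahead of every NA row
ord <- order(is.na(df$A), a_key, b_key)

# Row ord[i] receives rank i
df$C <- integer(nrow(df))
df$C[ord] <- seq_len(nrow(df))

df[ord, ]
```

Ties are broken by original row position here, since order() leaves unresolved ties in their original order; whether that is the desired tie rule is an assumption.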

Upvotes: 0
