Rmg

Reputation: 73

Adding a column to a data frame in R based on the rank of another column

Here is a reproducible example of my data. For the following data frame:

df <- data.frame(Subject = c('John', 'John', 'John', 'John','Mary', 'Mary', 'Mary', 'Mary'),
                 SNR = c(-4,-4,0,4,0,4,4,8))

I would like to add a column 'Rank' that ranks SNR within each Subject, with tied values sharing a rank and no ranks skipped, so that it would look like this:

Subject   SNR   Rank
John      -4    1
John      -4    1
John       0    2
John       4    3
Mary       0    1
Mary       4    2
Mary       4    2
Mary       8    3

I have tried using:

dfNew <- transform(df, Rank = ave(SNR, Subject, FUN = function(x) rank(x, ties.method = "first")))

But I get the following:

Subject   SNR   Rank
John      -4    1
John      -4    2
John       0    3
John       4    4
Mary       0    1
Mary       4    2
Mary       4    3
Mary       8    4

I have also tried using the different ties.method options, but none give me what I am looking for (i.e., ranking only from 1-3).

Any help would be much appreciated!

Upvotes: 6

Views: 589

Answers (4)

Haboryme

Reputation: 4761

A bit dirty but it seems to work:

library(dplyr)
# Within each Subject, the factor's integer codes follow the sorted distinct SNR values
df %>% group_by(Subject) %>% mutate(Rank = as.numeric(as.factor(SNR)))

  Subject   SNR  Rank
   <fctr> <dbl> <dbl>
1    John    -4     1
2    John    -4     1
3    John     0     2
4    John     4     3
5    Mary     0     1
6    Mary     4     2
7    Mary     4     2
8    Mary     8     3
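
The factor trick works because factor() sorts the distinct values into levels, so the integer codes have no gaps after ties. Here is a minimal base R sketch of the same idea on a single vector. The helper name dense_rank_base is hypothetical (it is not part of any package), and the comparison with rank() is only there to show why none of the ties.method options produce this result:

```r
# Hypothetical helper (not from any package): rank values so that ties share
# a rank and the next distinct value gets the next integer.
dense_rank_base <- function(x) match(x, sort(unique(x)))

x <- c(-4, -4, 0, 4)

rank(x, ties.method = "min")  # ties share a rank, but the following rank is skipped
dense_rank_base(x)            # ties share a rank, no rank is skipped
as.integer(factor(x))         # same idea via the factor's integer codes
```

To apply it per group, the helper could be passed to ave(), for example ave(df$SNR, df$Subject, FUN = dense_rank_base).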

Upvotes: 1

Jaap

Reputation: 83215

Another base R method:

transform(df, Rank = ave(SNR, Subject, FUN = function(x) cumsum(c(TRUE, head(x, -1) != tail(x, -1)))))

gives:

  Subject SNR Rank
1    John  -4    1
2    John  -4    1
3    John   0    2
4    John   4    3
5    Mary   0    1
6    Mary   4    2
7    Mary   4    2
8    Mary   8    3

This method only gives the correct result when equal SNR values are adjacent within each Subject. If your data frame is not sorted yet, order it first with df <- df[order(df$SNR),].
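
To illustrate the ordering caveat, here is a sketch on a deliberately shuffled copy of the question's data. The name df_shuffled is made up for this example. It sorts by Subject as well as SNR, which is not required but keeps each group's rows together, and it uses diff(x) != 0 as an equivalent way of detecting a change from the previous value:

```r
# Same values as the question's data, with the rows deliberately shuffled
# (df_shuffled is a hypothetical name used only for this illustration).
df_shuffled <- data.frame(
  Subject = c("Mary", "John", "Mary", "John", "John", "Mary", "John", "Mary"),
  SNR     = c(4, 0, 8, -4, 4, 0, -4, 4)
)

# Sort so that equal SNR values end up adjacent within each Subject.
df_sorted <- df_shuffled[order(df_shuffled$Subject, df_shuffled$SNR), ]

# Start a new rank whenever the value differs from the previous one.
df_sorted$Rank <- ave(df_sorted$SNR, df_sorted$Subject,
                      FUN = function(x) cumsum(c(TRUE, diff(x) != 0)))

df_sorted
```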

Upvotes: 2

989

Reputation: 12937

Using aggregate and factor in base R:

ag <- aggregate(SNR~Subject, df, function(x) as.numeric(factor(x)))
df$rank <- c(t(ag[,-1]))

  Subject SNR rank
1    John  -4    1
2    John  -4    1
3    John   0    2
4    John   4    3
5    Mary   0    1
6    Mary   4    2
7    Mary   4    2
8    Mary   8    3
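
Note that flattening the aggregate() result with c(t(...)) appears to rely on every Subject having the same number of rows and on the rows of df already being grouped by Subject in sorted order. A sketch of a variant that keeps the factor idea but avoids both assumptions, using ave() on made-up data with unequal group sizes (df_uneven is a hypothetical name):

```r
# Hypothetical data: groups of different sizes, rows not grouped by Subject.
df_uneven <- data.frame(
  Subject = c("John", "Mary", "John", "Mary", "John"),
  SNR     = c(-4, 4, -4, 0, 0)
)

# ave() applies the function within each Subject and writes the results back
# to the original row positions, so no reshaping or reordering is needed.
df_uneven$rank <- ave(df_uneven$SNR, df_uneven$Subject,
                      FUN = function(x) as.numeric(factor(x)))

df_uneven
```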

Upvotes: 2

infominer

Reputation: 2001

library(dplyr)
df %>%
  arrange(Subject, SNR) %>%
  group_by(Subject) %>%
  mutate(rank = dense_rank(SNR))

Credit to @rich-scriven for mentioning dense_rank().

Upvotes: 1
