lpryor

Reputation: 363

Construct a transform statement on the fly

I've got a dataframe with several (numeric) columns, and want to make a new dataframe whose columns are the ranks of the originals.

> df <- data.frame(id=LETTERS[1:10],
+                  wheat=c(123,234,345,456,678,987,876,654,432,321),
+                  barley=c(135,975,246,864,357,753,468,642,579,531))
> df
   id wheat barley
1   A   123    135
2   B   234    975
3   C   345    246
4   D   456    864
5   E   678    357
6   F   987    753
7   G   876    468
8   H   654    642
9   I   432    579
10  J   321    531
> rankeddf <- transform(df, wheat=rank(wheat), barley=rank(barley))
> rankeddf
   id wheat barley
1   A     1      1
2   B     2     10
3   C     4      2
4   D     6      9
5   E     8      3
6   F    10      8
7   G     9      4
8   H     7      7
9   I     5      6
10  J     3      5

The thing is, the number and names of the columns vary. I have a vector that specifies them:

cols <- c("wheat", "barley")

How can I construct the transform statement on the fly? Or even loop through the cols vector, applying a transform statement once on each iteration? I'm guessing the answer is going to have something to do with eval or evalq, but I haven't quite got my head around them yet. For instance,

> rankeddf2 <- df
> for (col in cols) {rankeddf2 <- transform(rankeddf2, evalq(paste(col,"=rank(",col,")",sep="")))}
> rankeddf2
   id wheat barley
1   A   123    135
2   B   234    975
3   C   345    246
4   D   456    864
5   E   678    357
6   F   987    753
7   G   876    468
8   H   654    642
9   I   432    579
10  J   321    531

doesn't do the trick.

Alternatively, is there another way of doing this?

Upvotes: 3

Views: 136

Answers (2)

Gavin Simpson

Reputation: 174813

I like to think of transform() and the related with() and within() as syntactic sugar that is useful interactively at the top level, but for jobs such as this, subsetting and replacement via '['(), '[<-'() et al. are often easier to use:

> df2 <- df ## copy
> df2[, cols] <- apply(df[, cols], 2, rank)
> df2
   id wheat barley
1   A     1      1
2   B     2     10
3   C     4      2
4   D     6      9
5   E     8      3
6   F    10      8
7   G     9      4
8   H     7      7
9   I     5      6
10  J     3      5

'['() and '[<-'() already do what you want, so you are trying to force transform() to do something that is already implemented much more easily with the subsetting and replacement functions.
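A variation on the same idea, offered only as a sketch: apply() converts its input to a matrix first, and df[, cols] drops to a plain vector when cols holds a single name, which makes apply() fail. Using list-style indexing with lapply() should avoid both issues. The data below simply rebuilds the question's example with numeric columns.

```r
# Toy data mirroring the question, built with numeric columns
df <- data.frame(
  id     = LETTERS[1:10],
  wheat  = c(123, 234, 345, 456, 678, 987, 876, 654, 432, 321),
  barley = c(135, 975, 246, 864, 357, 753, 468, 642, 579, 531)
)
cols <- c("wheat", "barley")

df2 <- df                             # copy, so df itself is left untouched
df2[cols] <- lapply(df2[cols], rank)  # rank each listed column, one at a time

# The same two lines also work when cols names only one column
one <- df
one["wheat"] <- lapply(one["wheat"], rank)
```

Because df2[cols] always returns a data frame (a list of columns), the replacement does not depend on how many names cols contains.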

Upvotes: 4

Andrie

Reputation: 179428

You can do this by using lapply() and rank():

as.data.frame(lapply(df[, cols], rank))
   wheat barley
1      1      1
2      2     10
3      4      2
4      6      9
5      8      3
6     10      8
7      9      4
8      7      7
9      5      6
10     3      5

OK, so in the process you lose the first column, but that's easy to add back:

data.frame(id=df[[1]], lapply(df[, cols], rank))
   id wheat barley
1   A     1      1
2   B     2     10
3   C     4      2
4   D     6      9
5   E     8      3
6   F    10      8
7   G     9      4
8   H     7      7
9   I     5      6
10  J     3      5
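If you do want a literal transform() call assembled at run time, one possible approach (a sketch, not the only way) is to build the named arguments as a list and hand them to do.call(), with no eval or evalq needed. The data below rebuilds the question's example with numeric columns; the variable name ranked is just illustrative.

```r
# Toy data mirroring the question, built with numeric columns
df <- data.frame(
  id     = LETTERS[1:10],
  wheat  = c(123, 234, 345, 456, 678, 987, 876, 654, 432, 321),
  barley = c(135, 975, 246, 864, 357, 753, 468, 642, 579, 531)
)
cols <- c("wheat", "barley")

# Named list of replacement columns: list(wheat = <ranks>, barley = <ranks>)
ranked <- lapply(df[cols], rank)

# Equivalent to transform(df, wheat = <ranks>, barley = <ranks>),
# but the argument names come from cols rather than being typed out
rankeddf <- do.call(transform, c(list(df), ranked))
```

The list names become the argument names of the call, so transform() overwrites exactly the columns listed in cols and leaves id alone.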

Upvotes: 6
