tobigue

Reputation: 3617

Apply function to column with value from other row

I have the following problem in R:

Let's assume the following data frame:

    a    b    c    d    e
1   1    1    1    1    15.5
2   1    1    1    2    8.3
3   1    1    2    1    12.4
4   1    1    2    2    3.2
...

I want to apply a function f(x,y) to the numbers from column e, where x and y are drawn from the two rows which have the same values in all columns except d (and e of course).

The output should be a new data frame, in which column d is dropped (as the "merge" made that column irrelevant) and column e is the result of the applied function.

So in the example above, assuming f(x,y) is addition, the new data frame should look like this:

    a    b    c     e
1   1    1    1     23.8
3   1    1    2     15.6
...

What I have tried so far looks something like the following, which feels very inelegant:

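# keep one row per (a, b, c) group: the row with d == 1
data.d1 <- subset(data, d == 1)
for (index in seq_len(nrow(data.d1))) {
  row1 <- data.d1[index, ]
  # find the partner row: same a, b, c but d == 2
  row2 <- data[data$a == row1$a & data$b == row1$b & data$c == row1$c & data$d == 2, ]
  # store the result in data.d1, so row positions match the loop index
  data.d1[index, "e"] <- f(row1$e, row2$e)
}
# drop column d from the merged result
data <- data.d1[, names(data.d1) != "d"]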

Does somebody have a cleaner solution, using apply() and the like? Thanks in advance!

Upvotes: 0

Views: 285

Answers (1)

kohske

Reputation: 66842

Here are two examples, where x is the data frame from the question:

d> ddply(x, .(a, b, c), summarize, e = sum(e))
  a b c    e
1 1 1 1 23.8
2 1 1 2 15.6
d> aggregate(e~a+b+c, sum, data = x)
  a b c    e
1 1 1 1 23.8
2 1 1 2 15.6

ddply is a function in the plyr package; aggregate is part of base R (the stats package, loaded by default).
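Both examples work because sum takes the whole group at once. If f really needs two separate arguments (for instance when the order of x and y matters), one possible approach is to merge the d == 1 rows with the d == 2 rows on a, b and c. The following is only a sketch: the subtraction used for f is a made-up stand-in for the asker's function, f is assumed to be vectorized, and every group is assumed to contain exactly one row with d == 1 and one with d == 2.

```r
# Example data from the question
x <- data.frame(a = c(1, 1, 1, 1),
                b = c(1, 1, 1, 1),
                c = c(1, 1, 2, 2),
                d = c(1, 2, 1, 2),
                e = c(15.5, 8.3, 12.4, 3.2))

# Hypothetical f(x, y); subtraction is used here only because it is
# order-sensitive. Replace it with the real function.
f <- function(x, y) x - y

# Split on d, dropping d itself, then join the halves on a, b, c
first  <- x[x$d == 1, c("a", "b", "c", "e")]
second <- x[x$d == 2, c("a", "b", "c", "e")]
m <- merge(first, second, by = c("a", "b", "c"), suffixes = c(".1", ".2"))

# e.1 comes from the d == 1 row, e.2 from the d == 2 row
result <- data.frame(m[c("a", "b", "c")], e = f(m$e.1, m$e.2))
print(result)
```

If f is not vectorized, mapply(f, m$e.1, m$e.2) can be used in place of the direct call.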

Upvotes: 4
