Reputation: 3617
I have the following problem in R:
Let's assume the following data frame:
a b c d e
1 1 1 1 1 15.5
2 1 1 1 2 8.3
3 1 1 2 1 12.4
4 1 1 2 2 3.2
...
I want to apply a function f(x,y) to the numbers from column e, where x and y are drawn from the two rows which have the same values in all columns except d (and e, of course).
The output should be a new data frame, in which column d is dropped (as the "merge" made that column irrelevant) and column e is the result of the applied function.
So in the example above, assuming f(x,y) is addition, the new data frame should look like this:
a b c e
1 1 1 1 23.8
3 1 1 2 15.6
...
What I have tried so far looks something like the following, which feels very inelegant:
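data.d1 <- subset(data, d == 1)
for (index in seq_len(nrow(data.d1))) {
  row1 <- data.d1[index, ]
  # partner row: same a, b, c but d == 2
  row2 <- data[data$a == row1$a & data$b == row1$b & data$c == row1$c & data$d == 2, ]
  # store the result in the d == 1 subset, so only one row per group remains
  data.d1[index, "e"] <- f(row1$e, row2$e)
}
# drop column d, which is no longer meaningful
data <- data.d1[-match("d", names(data.d1))]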
Does somebody have a cleaner solution, using apply() and the like?
Thanks in advance!
Upvotes: 0
Views: 285
Reputation: 66842
Here are two examples:
d> ddply(x, .(a, b, c), summarize, e = sum(e))
a b c e
1 1 1 1 23.8
2 1 1 2 15.6
d> aggregate(e~a+b+c, sum, data = x)
a b c e
1 1 1 1 23.8
2 1 1 2 15.6
ddply is a function in the plyr package.
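Both calls above use sum, which works because addition does not care about argument order. If f(x,y) is not symmetric, one possible approach is sketched below using only base R. It assumes every (a, b, c) group has exactly one row with d == 1 and one with d == 2; the data frame x is rebuilt from the question's example, and f here is a hypothetical stand-in (subtraction) chosen only for illustration.

```r
# Example data from the question (columns a-e as shown there)
x <- data.frame(
  a = c(1, 1, 1, 1),
  b = c(1, 1, 1, 1),
  c = c(1, 1, 2, 2),
  d = c(1, 2, 1, 2),
  e = c(15.5, 8.3, 12.4, 3.2)
)

# Hypothetical non-symmetric f, used only for illustration
f <- function(x, y) x - y

# Sort so that within each (a, b, c) group the d == 1 row comes before d == 2
x <- x[order(x$a, x$b, x$c, x$d), ]

# aggregate() hands each group's e values to FUN as a vector, in row order:
# v[1] is the d == 1 value, v[2] is the d == 2 value
res <- aggregate(e ~ a + b + c, data = x, FUN = function(v) f(v[1], v[2]))
print(res)
```

Column d drops out on its own because it is not part of the formula. If some groups could be missing one of the two rows, merging the d == 1 and d == 2 subsets on a, b, c would probably be the safer route.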
Upvotes: 4