Reputation: 4470
I have 4 columns in a data frame
a <- data.frame(a=c(1,2,3,4), b=c(4,5,6,7), c=c(7,6,5,4), d=c(8,4,3,2))
I want to average the first two columns and the last two columns, giving one data frame with two columns and the same number of rows as the original.
Expected output:
5 15
7 10
9 8
11 6
Upvotes: 0
Views: 618
Reputation: 4941
To reproduce your output (which is the sum, not the mean):
library(plyr)
ddply(a, .(), summarise, first=a+b, second=c+d)[,-1]
It produces:
first second
1 5 15
2 7 10
3 9 8
4 11 6
To make a data.frame with averages:
ddply(a, .(), summarise, first=(a+b)/2, second=(c+d)/2)[,-1]
Output is:
first second
1 2.5 7.5
2 3.5 5.0
3 4.5 4.0
4 5.5 3.0
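As a cross-check that does not depend on plyr, the same averages can be sketched in base R with rowMeans(). This is only an illustration; it assumes the columns are paired by position (1-2 and 3-4), and the name res is made up for the example.

```r
# Data frame from the question
a <- data.frame(a=c(1,2,3,4), b=c(4,5,6,7), c=c(7,6,5,4), d=c(8,4,3,2))

# Row-wise mean of columns 1-2 and of columns 3-4
res <- data.frame(first  = rowMeans(a[, 1:2]),
                  second = rowMeans(a[, 3:4]))
res
```

The values should agree with the ddply() output above.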
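If you don't know the column names, you can assign names by position first, since summarise evaluates its expressions against the columns of the data frame:
ddply(setNames(a, letters[1:4]), .(), summarise, first=a+b, second=c+d)[,-1]
Here setNames() names the columns by their order without modifying a. Alternatively, you can simply run names(a) <- letters[1:4] prior to ddply().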
ddply is a very flexible function: you can specify grouping variables as the second argument and get grouped results. But if the case is as simple as in the question, you can call summarise directly:
summarise(a, first=a+b, second=c+d) # if you know the column names
summarise(setNames(a, letters[1:4]), first=a+b, second=c+d) # if you don't know the column names
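To illustrate the grouped use of ddply() mentioned above, here is a minimal sketch. The grouping column g is hypothetical (it is not part of the question's data) and is added only to show how the second argument splits the data before summarise is applied to each piece.

```r
library(plyr)

# Question's data plus a hypothetical grouping column g
a2 <- data.frame(a=c(1,2,3,4), b=c(4,5,6,7), c=c(7,6,5,4), d=c(8,4,3,2),
                 g=c("x","x","y","y"))

# One output row per level of g, with the pair sums totalled within each group
grouped <- ddply(a2, .(g), summarise,
                 first  = sum(a + b),
                 second = sum(c + d))
grouped
```

With a real grouping variable the [,-1] step is not needed, because the first column then holds the group labels.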
Upvotes: 1