Reputation: 45
I am working in R with cross-sectional data and am having trouble grouping the data by several conditions. The problem is easiest to see on a small sample of my much larger dataset, shown below. I would like to calculate the average Distance for each combination of Province, District and Commune.
Province District Commune Distance
101 15 3 15
101 15 3 5
101 15 3 7
101 15 9 1
101 15 9 7
102 18 19 3
102 18 19 10
103 16 22 5
103 16 22 6
The expected result would look like this, with one row per commune within each district and province:
Province District Commune Distance
101 15 3 average
101 15 9 average
102 18 19 average
103 16 22 average
Upvotes: 2
Views: 61
Reputation: 4921
I think you are looking for the following (here df is your data frame):
library(plyr)
ddply(df, .(Province, District, Commune), summarize, val = mean(Distance))
Upvotes: 1
Reputation: 886938
Try
library(dplyr)
df1 %>%
  group_by(Province, District, Commune) %>%
  summarise(Distance = mean(Distance))
Or
aggregate(Distance~., df1, mean)
Or
library(data.table)
setDT(df1)[, .(Distance = mean(Distance)), by = .(Province, District, Commune)]
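To try any of these on the sample from the question, here is a minimal self-contained sketch. It assumes the columns are plain numeric vectors and reuses the name df1 from above; res is just an illustrative name. It spells out the grouping columns instead of using `.`, which should give the same result here since those are the only other columns.

```r
# Sample rows copied from the question; df1 matches the name used in this answer.
df1 <- data.frame(
  Province = c(101, 101, 101, 101, 101, 102, 102, 103, 103),
  District = c(15, 15, 15, 15, 15, 18, 18, 16, 16),
  Commune  = c(3, 3, 3, 9, 9, 19, 19, 22, 22),
  Distance = c(15, 5, 7, 1, 7, 3, 10, 5, 6)
)

# Base R: mean Distance for each Province/District/Commune combination.
res <- aggregate(Distance ~ Province + District + Commune, data = df1, FUN = mean)
print(res)
```

For these nine rows there are four groups, and the group means work out to 9, 4, 6.5 and 5.5.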
Upvotes: 1