Viet Tran Tuan

Reputation: 45

Group data under conditions

I am working in R with cross-sectional data and am having trouble grouping the data by several conditions. A small part of my very large dataset, shown below, illustrates the problem. I would like to calculate the average Distance over rows that share the same Province, District and Commune.

Province    District    Commune  Distance
101           15           3      15
101           15           3       5
101           15           3       7
101           15           9       1
101           15           9       7
102           18          19       3
102           18          19      10
103           16          22       5
103           16          22       6

The expected result would be as follows (one row for each commune within each district and province):

Province    District    Commune    Distance
101           15           3       average
101           15           9       average
102           18           19      average
103           16           22      average

Upvotes: 2

Views: 61

Answers (2)

Ruthger Righart

Reputation: 4921

I think you are looking for the following:

library(plyr)
ddply(df, .(Province, District, Commune), summarize, val = mean(Distance)) 

Upvotes: 1

akrun

Reputation: 886938

Try

library(dplyr)
df1 %>% 
    group_by(Province, District, Commune) %>% 
    summarise(Distance=mean(Distance))

Or

# the `.` groups by every column other than Distance
aggregate(Distance ~ ., df1, mean)

Or

library(data.table)
setDT(df1)[, list(Distance=mean(Distance)), .(Province, District, Commune)]
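For a quick reproducible check, here is a minimal sketch in base R that rebuilds the nine sample rows from the question and averages Distance per group. The name `df1` follows the code above, and `res` is just an illustrative name. The grouping columns are spelled out in the formula on the assumption that the full dataset may contain other columns that should not take part in the grouping.

```r
# Sample rows copied from the question (df1 and res are illustrative names)
df1 <- data.frame(
  Province = c(101, 101, 101, 101, 101, 102, 102, 103, 103),
  District = c(15, 15, 15, 15, 15, 18, 18, 16, 16),
  Commune  = c(3, 3, 3, 9, 9, 19, 19, 22, 22),
  Distance = c(15, 5, 7, 1, 7, 3, 10, 5, 6)
)

# Mean Distance for every Province/District/Commune combination
res <- aggregate(Distance ~ Province + District + Commune, data = df1, FUN = mean)
print(res)
# Group means for the sample: 9, 4, 6.5 and 5.5
```

On the sample this gives one row per commune, which matches the layout of the expected table in the question.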

Upvotes: 1
