Kevin
Kevin

Reputation: 3239

R - Get a table of averages by a specific field

Background: In my problem, I have a data set where each row represents a particular dive, for a particular event, for a particular diver. Each with its own score, and judge.

What I want to get is the average score that a particular judge gave to each diver by country.

The Data: To illustrate what I am saying:

  Event Round    Diver Country Rank DiveNo Difficulty JScore                   Judge JCountry
1 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.0 RUIZ-PEDREGUERA Rolando      CUB
2 M3mSB Final XIONG Ni     CHN    1      1        3.1    9.0             GEAR Dennis      NZL
3 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5           BOYS Beverley      CAN
4 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5           JOHNSON Bente      NOR
5 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5         BOUSSARD Michel      FRA
6 M3mSB Final XIONG Ni     CHN    1      1        3.1    8.5          CALDERON Felix      PUR

What I tried: which seems to work:

countries <- unique(x$Country[x$Judge==thisjudge])
AvgByCountry <- vector(mode = "numeric", length =length(countries))
for(i in 1:length(countries)){
  AvgByCountry[i] <- mean(x$JScore[x$Country[x$Judge==thisjudge]==countries[i]])
}
names(AvgByCountry) <- countries
AvgByCountry

The Question: I understand it may not be the best loop, but is there a better way to do this? I tried sub-setting, and a few other things, but none of which gave me what I wanted.

Upvotes: 0

Views: 74

Answers (1)

Jav
Jav

Reputation: 2313

Using data.tabe:

library(data.table)
set.seed(100)
DT <- data.table(X = rnorm(20),  "Country" = sample(c("US","UK"), 10, TRUE))

                 X Country
 1: -0.50219235      US
 2:  0.13153117      UK
 3: -0.07891709      UK
 4:  0.88678481      UK
 5:  0.11697127      UK
 6:  0.31863009      US
 7: -0.58179068      UK
 8:  0.71453271      UK
 9: -0.82525943      US
10: -0.35986213      US
11:  0.08988614      US
12:  0.09627446      UK
13: -0.20163395      UK
14:  0.73984050      UK
15:  0.12337950      UK
16: -0.02931671      US
17: -0.38885425      UK
18:  0.51085626      UK
19: -0.91381419      US
20:  2.31029682      US

DT[, Mean:=mean(X), by= 'Country']

          X Country       Mean


 1: -0.50219235      US 0.01104603
 2:  0.13153117      UK 0.17241456
 3: -0.07891709      UK 0.17241456
 4:  0.88678481      UK 0.17241456
 5:  0.11697127      UK 0.17241456
 6:  0.31863009      US 0.01104603
 7: -0.58179068      UK 0.17241456
 8:  0.71453271      UK 0.17241456
 9: -0.82525943      US 0.01104603
10: -0.35986213      US 0.01104603
11:  0.08988614      US 0.01104603
12:  0.09627446      UK 0.17241456
13: -0.20163395      UK 0.17241456
14:  0.73984050      UK 0.17241456
15:  0.12337950      UK 0.17241456
16: -0.02931671      US 0.01104603
17: -0.38885425      UK 0.17241456
18:  0.51085626      UK 0.17241456
19: -0.91381419      US 0.01104603
20:  2.31029682      US 0.01104603

Or as warmoverflow suggested using aggregate

 aggregate(X ~ Country, data = DT, mean)
  Country       X
1   UK     0.17241456
2   US     0.01104603

EDIT as Update to Comment:

library(data.table)
set.seed(100)
DT <- data.table(X = rnorm(20),  "Country" = sample(c("US","UK"), 10, TRUE), "Judge" = sample(c("James","Nick"), 10, TRUE)

aggregate(X ~ Country + Judge, data = DT, mean)

  Country Judge          X
1      UK James  0.1828624
2      US James  0.3045736
3      UK  Nick  0.1201754
4      US  Nick -0.8695368

Or data.table method

DT[, Mean:=mean(X), by= c('Country', 'Judge')]

Upvotes: 2

Related Questions