Reputation: 407
The dataset in use is genotype, from the CRAN package MASS.
> names(genotype)
[1] "Litter" "Mother" "Wt"
> str(genotype)
'data.frame': 61 obs. of 3 variables:
$ Litter: Factor w/ 4 levels "A","B","I","J": 1 1 1 1 1 1 1 1 1 1 ...
$ Mother: Factor w/ 4 levels "A","B","I","J": 1 1 1 1 1 2 2 2 3 3 ...
$ Wt : num 61.5 68.2 64 65 59.7 55 42 60.2 52.5 61.8 ...
This was the question from a tutorial: Exercise 6.7. Find the heaviest rats born to each mother in the genotype() data.
Using tapply, split by the factor genotype$Mother, gives:
> tapply(genotype$Wt, genotype$Mother, max)
   A    B    I    J
68.2 69.8 61.8 61.0
Also:
> out <- tapply(genotype$Wt, genotype[,1:2],max)
> out
      Mother
Litter    A    B    I    J
     A 68.2 60.2 61.8 61.0
     B 60.3 64.7 59.0 51.3
     I 68.0 69.8 61.3 54.5
     J 59.0 59.5 61.4 54.0
The first tapply call gives the weight of the heaviest rat for each mother, and the second (out) gives a table that lets me identify which litter type was heaviest for each mother. Is there another way to find which Litter has the greatest weight for each mother, for instance when the two-dimensional table is very large?
Upvotes: 0
Views: 82
Reputation: 887501
We could use data.table. We convert the 'data.frame' to a 'data.table' (setDT(genotype)), create the index using which.max, and subset the rows of the dataset grouped by 'Mother'.
library(data.table)  # v1.9.5+
setDT(genotype)[, .SD[which.max(Wt)], by = Mother]
# Mother Litter Wt
#1: A A 68.2
#2: B I 69.8
#3: I A 61.8
#4: J A 61.0
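For comparison, the same "row with the largest Wt per Mother" idea can be sketched in base R without data.table. This is only an illustration: the names gt and heaviest are made up for the example, and gt is taken from MASS::genotype so that it is a plain data.frame regardless of any earlier setDT() call.

```r
# Assumption: MASS is installed (it ships with standard R distributions).
gt <- MASS::genotype  # fresh data.frame copy of the dataset

# Split the rows by Mother, keep the single row with the largest Wt in
# each piece (which.max returns the first maximum), then stack the pieces.
pieces <- lapply(split(gt, gt$Mother), function(d) d[which.max(d$Wt), ])
heaviest <- do.call(rbind, pieces)
heaviest
```

Because which.max returns only the first maximum, a tie within a mother would be reported as a single row, just as in the data.table version.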
If we are only interested in the max of 'Wt' by 'Mother':
setDT(genotype)[, list(Wt=max(Wt)), by = Mother]
# Mother Wt
#1: A 68.2
#2: B 69.8
#3: I 61.8
#4: J 61.0
Based on the last tapply code shown by the OP, if we need similar output, we can use dcast from the devel version of 'data.table':
dcast(setDT(genotype), Litter ~ Mother, value.var='Wt', max)
# Litter A B I J
#1: A 68.2 60.2 61.8 61.0
#2: B 60.3 64.7 59.0 51.3
#3: I 68.0 69.8 61.3 54.5
#4: J 59.0 59.5 61.4 54.0
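If the Litter-by-Mother table is too large to scan by eye, one possible approach is to let R pick the winning row in each column. The sketch below rebuilds the OP's table with tapply and applies which.max column by column; gt and best_litter are illustrative names, and it assumes every Litter/Mother combination is present (no NA cells), as is the case for genotype.

```r
# Assumption: MASS is installed; gt is a plain data.frame copy of the data.
gt <- MASS::genotype

# Litter (rows) by Mother (columns) table of maximum weights, as in the question.
out <- tapply(gt$Wt, gt[, 1:2], max)

# For each Mother (column), look up the name of the Litter (row)
# holding the largest value in that column.
best_litter <- rownames(out)[apply(out, 2, which.max)]
names(best_litter) <- colnames(out)
best_litter
```

The result is a character vector named by Mother, so best_litter["B"] gives the litter type of the heaviest rat raised by mother B.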
library(MASS)
data(genotype)
Upvotes: 3
Reputation: 5152
From stats:
aggregate(Wt ~ Mother, data = genotype, max)
The `. ~ Mother` form would also apply max to Litter, an unordered factor for which max is not defined, so only Wt is aggregated here.
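aggregate() with max returns only the maximum weight per mother, not the litter it came from. A rough way to recover the Litter, assuming it is acceptable to match on the exact Wt values, is to merge the aggregated result back onto the data. The names gt, mx and res are illustrative.

```r
# Assumption: MASS is installed; gt is a plain data.frame copy of the data.
gt <- MASS::genotype

# Maximum Wt per Mother.
mx <- aggregate(Wt ~ Mother, data = gt, FUN = max)

# Keep only the original rows whose (Mother, Wt) pair matches a maximum;
# the Litter column comes along from gt.
res <- merge(gt, mx, by = c("Mother", "Wt"))
res
```

Unlike which.max, this keeps every row that ties for the maximum within a mother.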
Upvotes: 1