Reputation: 1345
Sorry if the description is vague - I'm extremely new to R and finding it hard to visualise exactly what I want to do. Suppose I have some data:
dat <- read.table(text = '
A B C
"Mike" 1 1
"Mike" 1 17
"Mike" 1 3
"Mike" 2 4
"Mike" 3 18
"Simon" 1 2
"Simon" 1 25
"Simon" 2 12
"Simon" 2 182
"Simon" 2 6', header=TRUE)
... etc.
Suppose I want to know the number of names (A column) that have 3 entries where B = 1, the number of names that have 3 entries where B = 2, and so on.
In the example above, "Mike" has 3 entries where B = 1, but not B = 2 or B = 3. "Simon" has 3 entries for B = 2, and so on. It's crossing entries in the data, which I've not done yet in R, and I'm not sure how best to approach it.
Upvotes: 2
Views: 86
Reputation: 263411
Assuming this is in a data.frame named dat:
> tapply(dat$B, dat$A, function(x) names(table(x))[table(x)==3] )
 Mike Simon 
  "1"   "2" 
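To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The helper name values_with_n_entries is hypothetical (it is not part of the original answer), and dat is rebuilt with data.frame() only so the snippet runs on its own.

```r
# Same rows as the question's example data.
dat <- data.frame(
  A = rep(c("Mike", "Simon"), each = 5),
  B = c(1, 1, 1, 2, 3, 1, 1, 2, 2, 2),
  C = c(1, 17, 3, 4, 18, 2, 25, 12, 182, 6)
)

# Hypothetical helper: which values of x occur exactly n times?
values_with_n_entries <- function(x, n = 3) {
  counts <- table(x)            # frequency of each distinct value
  names(counts)[counts == n]    # keep the values whose frequency is n
}

# Apply the helper to the B values of each name in A.
res <- tapply(dat$B, dat$A, values_with_n_entries)
res
```

Note that if some name had no B value with exactly 3 entries (or more than one), tapply could not simplify the result and would return a list instead of a character vector.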
Your comment suggests you wanted a tabular display, so perhaps this would also be of interest:
> xtabs( ~ A + B, dat)
       B
A       1 2 3
  Mike  3 1 1
  Simon 2 3 0
That table can also be queried directly, which is sometimes what is needed:
> which( xtabs( ~ A + B, dat) == 3, arr.ind=TRUE )
      row col
Mike    1   1
Simon   2   2
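If what is wanted is the number of names with exactly 3 entries for each value of B, one possible sketch is to compare the cross-tabulation with 3 and sum each column. The variable names tab and n_names are illustrative, and dat is rebuilt here only so the snippet is self-contained.

```r
# Same rows as the question's example data.
dat <- data.frame(
  A = rep(c("Mike", "Simon"), each = 5),
  B = c(1, 1, 1, 2, 3, 1, 1, 2, 2, 2),
  C = c(1, 17, 3, 4, 18, 2, 25, 12, 182, 6)
)

tab <- xtabs(~ A + B, dat)      # rows: names, columns: values of B

# tab == 3 is TRUE where a name has exactly 3 entries for that B;
# summing each column counts the names for which that holds.
n_names <- colSums(tab == 3)
n_names
# 1 2 3 
# 1 1 0 
```

Here one name (Mike) has 3 entries with B = 1, one name (Simon) has 3 entries with B = 2, and no name has 3 entries with B = 3.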
Upvotes: 3
Reputation: 162401
I believe this is what you're after (though I realize the code is terribly dense for an R newbie, and possibly even for not-so-newbies):
tab <- table(dat[1:2])
m <- max(tab)
apply(rbind(tab, m), 2, tabulate) - c(rep(0, m-1), 1)
#      1 2 3
# [1,] 0 1 1
# [2,] 1 0 0
# [3,] 1 1 0
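Values of B run along the top, while the entry counts (1, 2, or 3) run down the side. Each cell is the number of names that have that many entries with B=1, B=2, or B=3. For example, row 3 shows that one name has three entries with B=1 and one name has three entries with B=2.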
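For readers who find the tabulate version hard to parse, a less dense sketch that should give the same matrix is shown below. It is an alternative formulation, not the code from the answer; the names res and dense are illustrative, and dat is rebuilt so the snippet runs on its own.

```r
# Same rows as the question's example data.
dat <- data.frame(
  A = rep(c("Mike", "Simon"), each = 5),
  B = c(1, 1, 1, 2, 3, 1, 1, 2, 2, 2),
  C = c(1, 17, 3, 4, 18, 2, 25, 12, 182, 6)
)

tab <- table(dat[1:2])   # rows: names (A), columns: values of B
m <- max(tab)            # largest number of entries any name has for one B

# For each count k = 1..m, count the names with exactly k entries
# in each column of tab, then put the counts k on the rows.
res <- t(sapply(seq_len(m), function(k) colSums(tab == k)))
rownames(res) <- seq_len(m)
res

# The dense version from the answer, kept for comparison.
dense <- apply(rbind(tab, m), 2, tabulate) - c(rep(0, m - 1), 1)
```

The dense version appends an extra row containing m so that tabulate always returns m bins, then subtracts that extra 1 from the last row; the sketch above avoids that trick at the cost of one pass over tab per count.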
Upvotes: 1