Reputation: 773
I have a dataframe like this:
df <- data.frame(
SchoolID=c("A","A","B","B","C","D"),
Country=c("XX","XX","XX","YY","ZZ","ZZ"))
which gives me this data:
SchoolID Country
1 A XX
2 A XX
3 B XX
4 B YY
5 C ZZ
6 D ZZ
I would like to know for each SchoolID whether Country is uniquely assigned, by looking, for each distinct value of SchoolID, the number of distinct values of Country. So I would like to obtain this kind of table:
SchoolID NumberOfCountry
A 1
B 2
C 1
D 1
Upvotes: 2
Views: 252
Reputation: 92292
aggregate(Country ~ SchoolID, df, function(x) length(unique(x)))
Or
tapply(df$Country, df$SchoolID, function(x) length(unique(x)))
Or
library(data.table)
setDT(df)[, .(NumberOfCountry = length(unique(Country))), by = SchoolID]
Or with v >1.9.5
setDT(df)[, .(NumberOfCountry = uniqueN(Country)), by = SchoolID]
Or
library(dplyr)
df %>%
group_by(SchoolID) %>%
summarise(NumberOfCountry = n_distinct(Country))
Upvotes: 3
Reputation: 96937
One approach, which does not rely on third-party libraries:
> as.data.frame(rowSums(table(df[!duplicated(df), ]), na.rm=T))
rowSums(table(df[!duplicated(df), ]), na.rm = T)
A 1
B 2
C 1
D 1
Upvotes: 1
Reputation: 47
try this..
select School,count(Country)
from(
select distinct School,Country
from tbl_stacko) temp
group by School
Upvotes: -1