Peutch
Peutch

Reputation: 773

Compute the number of distinct values in col2 for each distinct value in col1 in R

I have a dataframe like this:

df <- data.frame(
          SchoolID=c("A","A","B","B","C","D"),
          Country=c("XX","XX","XX","YY","ZZ","ZZ"))

which gives me this data:

    SchoolID    Country
1   A           XX
2   A           XX
3   B           XX
4   B           YY
5   C           ZZ
6   D           ZZ

I would like to know for each SchoolID whether Country is uniquely assigned, by looking, for each distinct value of SchoolID, the number of distinct values of Country. So I would like to obtain this kind of table:

SchoolID   NumberOfCountry
A          1
B          2
C          1
D          1

Upvotes: 2

Views: 252

Answers (3)

David Arenburg
David Arenburg

Reputation: 92292

aggregate(Country ~ SchoolID, df, function(x) length(unique(x)))

Or

tapply(df$Country, df$SchoolID, function(x) length(unique(x)))

Or

library(data.table) 
setDT(df)[, .(NumberOfCountry = length(unique(Country))), by = SchoolID]

Or with v >1.9.5

setDT(df)[, .(NumberOfCountry = uniqueN(Country)), by = SchoolID]

Or

library(dplyr)
df %>% 
  group_by(SchoolID) %>% 
  summarise(NumberOfCountry = n_distinct(Country))

Upvotes: 3

Alex Reynolds
Alex Reynolds

Reputation: 96937

One approach, which does not rely on third-party libraries:

> as.data.frame(rowSums(table(df[!duplicated(df), ]), na.rm=T))
  rowSums(table(df[!duplicated(df), ]), na.rm = T)
A                                                1
B                                                2
C                                                1
D                                                1

Upvotes: 1

ans_Learning.....
ans_Learning.....

Reputation: 47

try this..

select School,count(Country)
from(
select distinct School,Country
from tbl_stacko) temp
group by School

Upvotes: -1

Related Questions