Reputation: 8247
I have a following dataframe in R
Serivce Codes
ABS RT
ABS RT
ABS TY
ABS DR
ABS DR
ABS DR
ABS DR
DEF RT
DEF RT
DEF TY
DEF DR
DEF DR
DEF DR
DEF DR
DEF TY
DEF SE
DEF SE
What I want is service wise code count in descending order
Serivce Codes Count
ABS DR 4
ABS RT 2
ABS TY 1
DEF DR 4
DEF RT 2
DEF TY 2
I am doing following in r
df%>%
group_by(Service,Codes) %>%
summarise(Count = n()) %>%
top_n(n=3,wt = Count) %>%
arrange(desc(Count)) %>%
as.data.frame()
But,it does not give me what is intended.
Upvotes: 8
Views: 10649
Reputation: 11
df%>% count(Service,Codes) %>% mutate(rank = dense_rank(desc(n))) %>% filter(rank < 5)
Upvotes: 1
Reputation: 38500
In base R, you can do this in two lines.
# get data.frame of counts by service-code pairs
mydf <- data.frame(table(dat))
# get top 3 by service
do.call(rbind, lapply(split(mydf, mydf$Serivce), function(x) x[order(-x$Freq)[1:3],]))
This returns
Serivce Codes Freq
ABS.1 ABS DR 4
ABS.3 ABS RT 2
ABS.7 ABS TY 1
DEF.2 DEF DR 4
DEF.4 DEF RT 2
DEF.6 DEF SE 2
In the first line use table
to get the counts, then convert to data.frame. In the second line, split by service, order by the negative values of order
and pull out the first three elements. Combine the results with do.call
.
Upvotes: 0
Reputation: 886938
We can try with count/arrange/slice
df1 %>%
count(Service, Codes) %>%
arrange(desc(n)) %>%
group_by(Service) %>%
slice(seq_len(3))
# A tibble: 6 x 3
# Groups: Service [2]
# Service Codes n
# <chr> <chr> <int>
#1 ABS DR 4
#2 ABS RT 2
#3 ABS TY 1
#4 DEF DR 4
#5 DEF RT 2
#6 DEF SE 2
In the OP's code, we need to arrange
by 'Service' too. As @Marius said in the comments, the top_n
will include more number of rows if there are ties. One option is to do a second grouping with 'Service' and slice
(as showed above) or after the grouping, we can filter
df1 %>%
group_by(Service,Codes) %>%
summarise(Count = n()) %>%
top_n(n=3,wt = Count) %>%
arrange(Service, desc(Count)) %>%
group_by(Service) %>%
filter(row_number() <=3)
Upvotes: 10