Luis Mompó

Reputation: 45

How to show the most repeated values or names in a data frame

I have an easy question related to the library dplyr in R.

My current data frame looks like this:

Players <- data.frame(Group = c("A", "A", "A", "A", "B", "B", "B", "C","C","C"), Players= c("Jhon", "Jhon", "Jhon", "Charles", "Mike", "Mike","Carl", "Max", "Max","Max"))

   Group Players
      A    Jhon
      A    Jhon
      A    Jhon
      A  Charles
      B    Mike
      B    Mike
      B    Carl
      C     Max
      C     Max
      C     Max

I would like to get another data frame with the most repeated player in each group and how many times that player is listed. So I would like to get this data frame:

Group Players TimesListed
    A    Jhon           3
    B    Mike           2
    C     Max           3

I have tried this:

    Station <- Players %>% group_by(Group,Players) %>% 
        summarise(TimesListed=length(Players)) %>% 
        summarise(TimesListed=max(TimesListed))

But I get a data frame without the names of the players like this:

   Group TimesListed

1      A           3
2      B           2
3      C           3

Any idea? Thank you!

Upvotes: 1

Views: 87

Answers (3)

www

Reputation: 39154

For completeness, here is a solution using data.table.

library(data.table)

setDT(Players)

Players[, .(TimesListed = .N), by = .(Group, Players)][
  , .SD[which.max(TimesListed)], by = Group]
#    Group Players TimesListed
# 1:     A    Jhon           3
# 2:     B    Mike           2
# 3:     C     Max           3

The above solution returns only the first row with the maximum TimesListed in each group. If we want to return all rows equal to the maximum, we can do the following. This data has no ties within a group, so the two solutions give the same result.

Players[, .(TimesListed = .N), by = .(Group, Players)][
  , .SD[TimesListed == max(TimesListed)], by = Group]
#    Group Players TimesListed
# 1:     A    Jhon           3
# 2:     B    Mike           2
# 3:     C     Max           3
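
To make the difference between the two variants visible, here is a minimal sketch on a small made-up table in which two players are tied. The `Tied` data and the object names `Counts`, `FirstMax` and `AllMax` are assumptions for illustration only, not part of the question.

```r
library(data.table)

# Hypothetical data: Mike and Carl are tied in group B
Tied <- data.table(
  Group   = c("A", "A", "B", "B", "B", "B"),
  Players = c("Jhon", "Jhon", "Mike", "Mike", "Carl", "Carl")
)

# Count how many times each player appears in each group
Counts <- Tied[, .(TimesListed = .N), by = .(Group, Players)]

# which.max() keeps only the first maximum per group
FirstMax <- Counts[, .SD[which.max(TimesListed)], by = Group]

# Comparing against max() keeps every row tied for the maximum
AllMax <- Counts[, .SD[TimesListed == max(TimesListed)], by = Group]
```

With this data, `FirstMax` should have one row per group, while `AllMax` should keep both Mike and Carl for group B.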

Upvotes: 0

Onyambu

Reputation: 79238

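You can use the aggregate function in base R to get the highest count per group (note that this returns the count only, not the player's name):

aggregate(. ~ Group, Players, function(x) max(table(x)))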
  Group Players
1     A       3
2     B       2
3     C       3
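
If the player's name is needed as well, one possible base R approach is to count every Group/Players combination with `table()` and then keep the rows that match the per-group maximum via `ave()`. This is only a sketch of one way to do it; the names `Counts`, `IsMax` and `Result` are assumptions introduced here.

```r
# Data from the question
Players <- data.frame(
  Group   = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Players = c("Jhon", "Jhon", "Jhon", "Charles", "Mike", "Mike",
              "Carl", "Max", "Max", "Max")
)

# Count every Group/Players combination (combinations that never
# occur get a count of 0 and are filtered out below)
Counts <- as.data.frame(
  table(Group = Players$Group, Players = Players$Players),
  responseName = "TimesListed",
  stringsAsFactors = FALSE
)

# Keep the rows whose count equals the maximum within their group
IsMax  <- Counts$TimesListed == ave(Counts$TimesListed, Counts$Group, FUN = max)
Result <- Counts[IsMax, ]

# Sort by group and reset the row names
Result <- Result[order(Result$Group), ]
rownames(Result) <- NULL
Result
```

Like the `==` comparison in the data.table answer, this keeps all tied players in a group rather than only the first one.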

Upvotes: 1

tyluRp

Reputation: 4768

This should get you what you want:

library(dplyr)

Players %>% 
  group_by(Group) %>% 
  count(Players) %>% 
  top_n(1, n)

# A tibble: 3 x 3
# Groups:   Group [3]
   Group Players     n
  <fctr>  <fctr> <int>
1      A    Jhon     3
2      B    Mike     2
3      C     Max     3

You could do the following to convert the factors to characters:

Players[] <- lapply(Players, as.character)

And if you need to rename the column n to TimesListed, add the following to the end of the chain:

rename(TimesListed = n)
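
For reference, here is a sketch of the whole pipeline in one piece, assuming a recent dplyr (1.0.0 or later), where `slice_max()` is available as the successor to `top_n()` and `count()` accepts a `name` argument so the separate `rename()` step is not needed. The object name `Result` is an assumption for illustration.

```r
library(dplyr)

# Data from the question
Players <- data.frame(
  Group   = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Players = c("Jhon", "Jhon", "Jhon", "Charles", "Mike", "Mike",
              "Carl", "Max", "Max", "Max")
)

Result <- Players %>%
  # Count each Group/Players combination, naming the count column directly
  count(Group, Players, name = "TimesListed") %>%
  group_by(Group) %>%
  # Keep the row(s) with the highest count in each group (ties are kept)
  slice_max(TimesListed, n = 1) %>%
  ungroup()

Result
```

By default `slice_max()` keeps ties; passing `with_ties = FALSE` would return a single row per group instead.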

Upvotes: 1
