Reputation: 149
What would you do if you wanted to find the maximum frequency for each column in a data frame and return the factors, categories, and frequency?
So I have the code as follows:
dfreqcommon = data.frame()
for (i in 1:ncol(diamonds)) {
  dfc = data.frame(t(table(diamonds[,i])))
  dfc$Var1 = names(diamonds)[i]
  dfreqcommon = rbind(dfreqcommon, dfc)
}
names(dfreqcommon) = c("Factors","Categories","Frequency")
dfreqcommon
But this returns every factor, category, and frequency. I only want the maximum frequency for each factor, along with its category. I tried changing dfc to
dfc = data.frame(max(t(table(diamonds[,i]))))
But it doesn't show the categories. Is there any way to fix this?
Upvotes: 1
Views: 666
Reputation: 24074
Another way, with base R:
library(ggplot2) # only to get the diamonds data.frame
data.frame(Factors = colnames(diamonds),
           t(sapply(diamonds, # apply the following function to each column
                    function(x) {
                      t_x <- sort(table(x), decreasing = TRUE) # get the frequencies and sort them in decreasing order
                      list(Categories = names(t_x)[1], # name of the value with the highest frequency
                           Frequency = t_x[1])         # highest frequency
                    })))
#        Factors Categories Frequency
#carat     carat        0.3      2604
#cut         cut      Ideal     21551
#color     color          G     11292
#clarity clarity        SI1     13065
#depth     depth         62      2239
#table     table         56      9881
#price     price        605       132
#x             x       4.37       448
#y             y       4.34       437
#z             z        2.7       767
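As for why the attempt in the question loses the categories: max() returns only the largest count and drops the names of the table, whereas which.max() returns the position of that count, so the name can still be looked up. Below is a minimal sketch of that idea, not a drop-in replacement for the code above. The helper name most_common and the toy data frame are made up for illustration; diamonds could be passed in its place.

```r
# Toy data frame standing in for diamonds (hypothetical values)
df <- data.frame(
  cut   = c("Ideal", "Ideal", "Good", "Premium"),
  color = c("G", "E", "G", "G")
)

# Hypothetical helper: most frequent value of one column and its count
most_common <- function(x) {
  tab <- table(x)
  top <- which.max(tab)  # position of the first maximum; ties go to the first one
  data.frame(Categories = names(tab)[top],
             Frequency  = as.integer(tab[top]))
}

# One row per column; rbind() uses the list names (column names) as row names
result <- do.call(rbind, lapply(df, most_common))
result <- data.frame(Factors = rownames(result), result, row.names = NULL)
result
```

Note that which.max() reports only the first maximum, so a column with tied top counts still yields a single row.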
Upvotes: 2
Reputation: 4534
Do you mean you want a result something like this? The following example shows how you could get the most frequently occurring value for each column in the ggplot2::diamonds dataset.
library(dplyr)
library(tidyr)
ggplot2::diamonds %>%
  mutate_all(as.character) %>%
  gather(varname, value) %>%
  count(varname, value) %>%
  group_by(varname) %>%
  arrange(desc(n), .by_group = TRUE) %>%
  slice(1)
#> # A tibble: 10 x 3
#> # Groups:   varname [10]
#>    varname value     n
#>    <chr>   <chr> <int>
#>  1 carat   0.3    2604
#>  2 clarity SI1   13065
#>  3 color   G     11292
#>  4 cut     Ideal 21551
#>  5 depth   62     2239
#>  6 price   605     132
#>  7 table   56     9881
#>  8 x       4.37    448
#>  9 y       4.34    437
#> 10 z       2.7     767
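In more recent tidyverse releases, mutate_all() and gather() are superseded, so the same pipeline can probably be written with across(), pivot_longer(), and slice_max(). The following is a sketch under the assumption of dplyr 1.0.0 or later and tidyr 1.0.0 or later. The function name top_value_per_column and the toy data frame are invented for illustration; ggplot2::diamonds could be passed in its place.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: most frequent value per column, in long format
top_value_per_column <- function(df) {
  df %>%
    mutate(across(everything(), as.character)) %>%  # common type so columns can be stacked
    pivot_longer(everything(), names_to = "varname", values_to = "value") %>%
    count(varname, value) %>%                       # frequency of each value per column
    group_by(varname) %>%
    slice_max(n, n = 1, with_ties = FALSE) %>%      # keep one top row per column
    ungroup()
}

# Toy data standing in for diamonds (hypothetical values)
toy <- data.frame(
  cut   = c("Ideal", "Ideal", "Good", "Premium"),
  color = c("G", "E", "G", "G")
)
top <- top_value_per_column(toy)
top
```

Setting with_ties = FALSE mirrors slice(1) in the answer above by returning exactly one row per column; leaving it at the default would return every value tied for the top count.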
Upvotes: 1