Ahmed Youssif

Reputation: 1

How does this mode formula in R work?

I found this formula, which can be used in R to find the mode of any column in a dataset. How does it work?

names(sort(-table(mtcars$wt)))[1]

It can be used to find the mode of the wt column.

I need to understand this formula.

Upvotes: 0

Views: 26

Answers (1)

r2evans

Reputation: 160617

To learn what the whole expression does, you should step through each component.

  • table tabulates (counts) the occurrences for each unique value within $wt:

    table(mtcars$wt)
    # 1.513 1.615 1.835 1.935  2.14   2.2  2.32 2.465  2.62  2.77  2.78 2.875  3.15  3.17  3.19 3.215 3.435  3.44  3.46 
    #     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     3     1 
    #  3.52  3.57  3.73  3.78  3.84 3.845  4.07  5.25 5.345 5.424 
    #     1     2     1     1     1     1     1     1     1     1 
    

    Note that the original "value" of $wt is stored as the names within the returned vector.

  • sort(-table(.)) then brings the most-frequent value to the front (left) and least-frequent value to the back (right).

    sort(-table(mtcars$wt))
    #  3.44  3.57 1.513 1.615 1.835 1.935  2.14   2.2  2.32 2.465  2.62  2.77  2.78 2.875  3.15  3.17  3.19 3.215 3.435 
    #    -3    -2    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1 
    #  3.46  3.52  3.73  3.78  3.84 3.845  4.07  5.25 5.345 5.424 
    #    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1 
    

    Sorting on the negative of it is equivalent to sort(table(.), decreasing=TRUE).

  • names(..) will return the original wt values from this vector (as character strings), sorted in decreasing order of their counts. Adding [1] to that returns only the first of those names.
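The same three steps can be traced on a small vector, where the counts are easy to check by hand. This is only a sketch: the vector x and the intermediate variable names are made up for illustration and are not part of the original formula.

```r
# Toy data (hypothetical, not taken from mtcars): 2.5 occurs three times
x <- c(3.1, 2.5, 2.5, 4.0, 2.5, 3.1)

counts  <- table(x)        # count per unique value; the values become the names
by_freq <- sort(-counts)   # negated counts, so the most frequent value sorts first
mode_chr <- names(by_freq)[1]      # first name = most frequent value, as a string
mode_num <- as.numeric(mode_chr)   # convert back if a number is needed

mode_chr
# "2.5"
```

Because names() always returns character strings, the as.numeric() step is needed whenever the mode is going to be used in arithmetic.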

Long story short: this returns the most frequently occurring value in mtcars$wt, as a character string. FYI, if several values tie for the highest count, only the first of them is returned, and the code gives no indication of the tie.
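If ties matter, one possible variation is to keep every value whose count equals the maximum, instead of taking only the first name. The helper name all_modes below is hypothetical, and this is a minimal sketch rather than a drop-in replacement: it still returns character strings, and it assumes x is non-empty.

```r
# Hypothetical helper: return every value that shares the highest count
all_modes <- function(x) {
  counts <- table(x)                     # count per unique value
  names(counts)[counts == max(counts)]   # keep all names that reach the maximum
}

all_modes(mtcars$wt)          # a single mode
# "3.44"
all_modes(c(1, 1, 2, 2, 3))   # two values tie for the highest count
# "1" "2"
```

A result of length greater than one then signals that the data has more than one mode.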

Upvotes: 2
