Reputation: 1
I found this formula, which can be used in R to find the mode of any column in a dataset. How does it work?
names(sort(-table(mtcars$wt)))[1]
It can be used to find the mode of the wt column.
I need to understand this formula.
Upvotes: 0
Views: 26
Reputation: 160617
To learn what the whole expression does, you should step through each component.
table tabulates (counts) the occurrences of each unique value within $wt:
table(mtcars$wt)
# 1.513 1.615 1.835 1.935  2.14   2.2  2.32 2.465  2.62  2.77  2.78 2.875  3.15  3.17  3.19 3.215 3.435  3.44  3.46
#     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     3     1
#  3.52  3.57  3.73  3.78  3.84 3.845  4.07  5.25 5.345 5.424
#     1     2     1     1     1     1     1     1     1     1
Note that the original values of $wt are stored as the names of the returned vector.
sort(-table(.)) then brings the most frequent value to the front (left) and the least frequent values to the back (right):
sort(-table(mtcars$wt))
#  3.44  3.57 1.513 1.615 1.835 1.935  2.14   2.2  2.32 2.465  2.62  2.77  2.78 2.875  3.15  3.17  3.19 3.215 3.435
#    -3    -2    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1
#  3.46  3.52  3.73  3.78  3.84 3.845  4.07  5.25 5.345 5.424
#    -1    -1    -1    -1    -1    -1    -1    -1    -1    -1
Sorting on the negative of the counts has the same effect as sort(table(.), decreasing=TRUE), except that the counts shown are negated.
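As a quick check of that, the two forms can be run side by side. This is only a sketch; the variable names counts, by_negation and by_decreasing are made up for illustration.

```r
counts <- table(mtcars$wt)

by_negation   <- sort(-counts)                    # counts become negative
by_decreasing <- sort(counts, decreasing = TRUE)  # counts stay positive

# Both put the most frequent value first
names(by_negation)[1]
names(by_decreasing)[1]
```

The decreasing = TRUE form is arguably easier to read, since the counts keep their real sign.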
names(..) returns the original wt values from this vector, sorted in decreasing order of their counts. Adding [1] to that returns only the first of those names.
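To see the pieces fit together, the one-liner can be unrolled into intermediate variables. The names counts, sorted_counts and values are only illustrative; they are not part of the original expression.

```r
counts        <- table(mtcars$wt)    # count each distinct weight
sorted_counts <- sort(-counts)       # most frequent first
values        <- names(sorted_counts) # the weights, as character strings

values[1]                            # "3.44"

# names() always returns character, so convert if a number is needed
as.numeric(values[1])                # 3.44
```

Keep in mind that the result is a character string, not a number, unless it is converted as in the last line.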
Long story short: this returns the first value within mtcars$wt that occurs the most. FYI, if multiple values share the highest count, this code returns only one of them and gives no indication that there was a tie.
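If ties matter, one possible approach is to keep every value whose count equals the maximum. The helper below, all_modes, is a hypothetical name and just a minimal sketch; it is not a base R function.

```r
# Hypothetical helper: returns every value tied for the highest count
all_modes <- function(x) {
  counts <- table(x)
  names(counts)[counts == max(counts)]
}

all_modes(mtcars$wt)         # one mode here: "3.44"
all_modes(c(1, 1, 2, 2, 3))  # two values tie: "1" "2"
```

As with the original expression, the result is a character vector, so wrap it in as.numeric() if numbers are needed.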
Upvotes: 2