Reputation: 1285
Here are the observations of two individuals of my dataset.
data=structure(list(id = c(2L, 2L, 2L, 3L, 3L, 3L), trt = c(1L, 1L,
1L, 1L, 1L, 1L), status = c(0L, 0L, 0L, 2L, 2L, 2L), stage = c(3L,
3L, 3L, 4L, 4L, 4L), spiders = c(1L, 1L, 1L, 0L, 1L, 0L), sex = structure(c(2L,
2L, 2L, 1L, 1L, 1L), .Label = c("m", "f"), class = "factor"),
hepato = c(1L, 1L, 1L, 0L, 1L, 0L), edema = c(0, 0, 0, 0.5,
0, 0.5), ascites = c(0L, 0L, 0L, 0L, 0L, 0L)), row.names = c(NA,
-6L), class = "data.frame")
I want to calculate the statistical mode for each individual after grouping by id. I used the code below:
library(dplyr)
library(modeest)
data %>%
  group_by(id) %>%
  mutate(edema2 = mlv(edema))
I get an error message when calculating the mode, although the same approach works well with other summary statistics such as mean, sd, min, max, etc.
Upvotes: 0
Views: 154
Reputation: 388797
The warnings that you are getting suggest two things:
1. You have not specified which method to use, so the default method 'shorth' is used.
2. There is a tie in the selection of the mode value.
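If you would rather stay with modeest, the first warning can be avoided by naming the method explicitly. The following is only a sketch: it assumes a recent modeest release in which mlv() returns a plain numeric vector, and the toy data frame and the `[1]` tie-break are my own illustrative choices, not part of the question.

```r
library(dplyr)
library(modeest)

# Toy data mirroring the id and edema columns from the question (illustrative)
toy <- data.frame(id = c(2L, 2L, 2L, 3L, 3L, 3L),
                  edema = c(0, 0, 0, 0.5, 0, 0.5))

res <- toy %>%
  group_by(id) %>%
  # method = "mfv" asks for the most frequent value(s);
  # [1] keeps only the first one, so a tie cannot return several values per group
  mutate(edema2 = mlv(edema, method = "mfv")[1]) %>%
  ungroup()

res
```

Taking the first element is an arbitrary way to resolve ties; choose a different rule if ties matter in your data.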
Alternatively, why not use the Mode function from here:
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
To apply it by group, you can use it with dplyr as:
library(dplyr)
data %>%
  group_by(id) %>%
  mutate(edema2 = Mode(edema))
#     id   trt status stage spiders sex   hepato edema ascites edema2
#  <int> <int>  <int> <int>   <int> <fct>  <int> <dbl>   <int>  <dbl>
#1     2     1      0     3       1 f          1   0         0    0
#2     2     1      0     3       1 f          1   0         0    0
#3     2     1      0     3       1 f          1   0         0    0
#4     3     1      2     4       0 m          0   0.5       0    0.5
#5     3     1      2     4       1 m          1   0         0    0.5
#6     3     1      2     4       0 m          0   0.5       0    0.5
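One thing worth knowing about this Mode function is how it behaves on ties: which.max() returns the first maximum, so the value that appears first in the vector wins. A small base-R sketch (the input vectors are made up for illustration):

```r
# Same helper as above, repeated so the snippet runs on its own
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

Mode(c(0.5, 0, 0.5))    # 0.5 occurs twice, so 0.5 is returned
Mode(c(1, 2))           # tie: the first value encountered (1) is returned
Mode(c("b", "a", "b"))  # also works on character vectors, returns "b"
```

Because it always returns exactly one value, it is safe to use inside mutate() or summarise() without a size-mismatch error.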
Upvotes: 4