Reputation: 1031
I need to compute weighted Mann-Whitney U test results a few hundred times. Each iteration is a two-sample test for differences between two groups. I can't figure out how to get the existing function to handle missing values without dynamically deleting cases.
The data for a few of the comparisons are here, in a data frame I call dat. All variables with numbers in this sheet are numeric in type.
Here's how I call the sjstats::mannwhitney() function:
mannwhitney(dat, measure1, group)
When I do so, I get the following error:
Error in `[[<-.data.frame`(`*tmp*`, "grp1.label", value = character(0)) :
replacement has 0 rows, data has 1
I suspect this is because of the missing value in the 212th observation of measure1. But wrapping the vector names in na.omit() or !is.na() doesn't address the problem, perhaps because doing so still results in a data frame where the number of non-NA values of group is greater than the number of non-NA values in measure1.
Any thoughts on how I could incorporate dynamic NA handling into the function call?
Upvotes: 1
Views: 458
Reputation: 46958
I am not sure what class your group column is, but if I do it like this:
library(sjstats)
dat = read.csv("question - Sheet1.csv")
str(dat)
'data.frame': 301 obs. of 5 variables:
$ measure1 : num 2 1.6 2.2 2.7 1.8 1.8 4 4 3.9 -3.7 ...
$ measure2 : num 0.9 0.1 0 0.4 -1 -1.3 2.1 0 -1.1 -3.9 ...
$ measure3 : num 1.1 1.1 2.2 1.2 1.9 1.2 0 3 1.9 -3.8 ...
$ measurre4: num 2 2 2 3 3 2 3 4 3 2.36 ...
$ group : int 0 0 0 0 0 0 0 0 0 0 ...
I get:
mannwhitney(dat, measure1, group)
Error in `[[<-.data.frame`(`*tmp*`, "grp1.label", value = character(0)) :
replacement has 0 rows, data has 1
Convert your group column to a factor:
dat$group = factor(dat$group)
mannwhitney(dat, measure1, group)
# Mann-Whitney-U-Test
Groups 1 = 0 (n = 110) | 2 = 1 (n = 190):
U = 16913.000, W = 10808.000, p = 0.621, Z = 0.495
effect-size r = 0.029
rank-mean(1) = 153.75
rank-mean(2) = 148.62
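Since the question involves repeating this a few hundred times, here is a minimal sketch of one way to prepare each measure/group pair up front. It is not part of sjstats: prep_pair is a hypothetical helper, toy is made-up data standing in for dat, and the base-R wilcox.test() call is only an unweighted sanity check that each prepared frame is usable.

```r
# Hypothetical helper (not part of sjstats): keep rows where both columns
# are observed and make sure the grouping column is a factor.
prep_pair <- function(data, measure, group) {
  cols <- c(measure, group)
  out <- data[stats::complete.cases(data[, cols]), cols, drop = FALSE]
  out[[group]] <- factor(out[[group]])
  out
}

# Toy data standing in for dat; the values are made up for illustration.
toy <- data.frame(
  measure1 = c(2.0, 1.6, NA, 2.7, 1.8, 4.0),
  measure2 = c(0.9, 0.1, 0.0, NA, -1.0, 2.1),
  group    = c(0, 0, 0, 1, 1, 1)
)

# One prepared data frame per measure, named after the measure.
measures <- c("measure1", "measure2")
prepared <- lapply(stats::setNames(measures, measures),
                   function(m) prep_pair(toy, m, "group"))

# Unweighted base-R rank test on each prepared frame, as a sanity check.
p_values <- vapply(prepared, function(d) {
  parts <- split(d[[1]], d[[2]])
  stats::wilcox.test(parts[[1]], parts[[2]], exact = FALSE)$p.value
}, numeric(1))
```

Each element of prepared could then presumably be passed to mannwhitney() in the same style as the call above (for example mannwhitney(prepared$measure1, measure1, group)), though that step is an assumption and is not run here. Note that in the output above the function already reports n = 110 and n = 190 for 301 rows, which suggests it drops the missing case on its own once group is a factor.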
Reading the code, the bug comes from this:
labels <- sjlabelled::get_labels(grp, attr.only = F, values = NULL,
non.labelled = T)
If your group column is numeric, it carries no label attributes, so get_labels() returns NULL:
sjlabelled::get_labels(0:1)
NULL
sjlabelled::get_labels(factor(0:1))
[1] "0" "1"
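The same contrast can be seen with base R alone. This is only a rough analogy, assuming that the labels ultimately come from attributes such as factor levels; it does not reproduce what sjlabelled does internally, and as_group_factor is a hypothetical guard, not a library function.

```r
# A bare integer vector has no attributes at all, so there is nothing
# for a label lookup to read.
num_group <- 0:1
attributes(num_group)   # NULL
levels(num_group)       # NULL

# A factor carries a "levels" attribute that can serve as group labels.
fac_group <- factor(num_group)
levels(fac_group)       # "0" "1"

# Hypothetical guard to run before the test: convert only when needed.
as_group_factor <- function(x) if (is.factor(x)) x else factor(x)
```

Applying a guard like dat$group <- as_group_factor(dat$group) before each call would leave existing factors untouched and convert numeric grouping columns.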
Upvotes: 1