Reputation: 53
I want to check whether the positions of the NA values in a vector are uniformly distributed (length = 30, fewer than 6 NAs).
This example has length 30 and 4 NAs. It is easy to see that the NAs are not spread uniformly: they sit mainly on the left.
vector_x <- c(NA, 3, NA, 1, NA, 5, 6, 7, 7, 9, 0, 2, 12, 324, 54, 23, 12, 324, 122, 23, 324, 332, 45, 78, 32, 12, 342, 95, 67, NA)
But I have no idea which statistic or test to use to describe this, so that I can screen vectors quantitatively with a cutoff.
So far I have two preliminary thoughts.
To simplify the problem, treat every NA as 0 and every number as 1, then look at the distribution of the resulting binary vector.
Or take the indices of the NAs, c(1, 3, 5, 30), and analyse their variance.
Thanks for any suggestions!
Upvotes: 0
Views: 99
Reputation: 13591
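You want to perform a Mann-Whitney U test, also known as the Wilcoxon rank-sum test (the second name is more descriptive of what it's doing): it compares the positions of the NA values with the positions of the non-NA values.
This is easy to do with your data
v <- vector_x  # the vector from the question
which(is.na(v))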
# [1] 1 3 5 30
which(!is.na(v))
# [1] 2 4 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
wilcox.test(which(is.na(v)), which(!is.na(v)))
# Wilcoxon rank sum test
# data: which(is.na(v)) and which(!is.na(v))
# W = 29, p-value = 0.1766
# alternative hypothesis: true location shift is not equal to 0
Check that wilcox.test behaves the way we expect with two extreme cases, one with fully separated positions and one with interleaved positions
wilcox.test(1:5, 6:10) # low p value
wilcox.test(seq(1,10,2), seq(2,10,2)) # high p value
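Since the question asks about screening with a cutoff, the test above could be wrapped in a small helper that returns only the p-value. This is a minimal sketch, not part of the original answer: the function name na_position_pvalue and the 0.05 cutoff are assumptions chosen for illustration.

```r
# Hypothetical helper: p-value of the rank-sum test comparing the positions
# of NA values with the positions of non-NA values.
na_position_pvalue <- function(x) {
  na_idx <- which(is.na(x))
  ok_idx <- which(!is.na(x))
  # The test needs at least one position in each group
  if (length(na_idx) == 0 || length(ok_idx) == 0) return(NA_real_)
  wilcox.test(na_idx, ok_idx)$p.value
}

vector_x <- c(NA, 3, NA, 1, NA, 5, 6, 7, 7, 9, 0, 2, 12, 324, 54, 23, 12,
              324, 122, 23, 324, 332, 45, 78, 32, 12, 342, 95, 67, NA)

p <- na_position_pvalue(vector_x)
cutoff <- 0.05          # arbitrary cutoff, an assumption
flagged <- p < cutoff   # TRUE would mean the NA positions look shifted
```

Keep in mind that a rank-sum test looks for a shift in location, so it reacts to NAs piling up at one end. NAs clustered in the middle, or split between both ends as with position 30 here, may not produce a small p-value.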
Upvotes: 2
Reputation: 887741
If we need the index of NA elements, use is.na to convert to a logical vector, then which returns the numeric index where it is TRUE
which(is.na(vector_x))
#[1] 1 3 5 30
Or, to convert to a binary vector where 0 represents NA and 1 represents any other value
as.integer(!is.na(vector_x))
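One way to "see the distribution" from that binary vector is to count the NAs in consecutive blocks of positions. The sketch below is only an illustration: the block size of 10 and the variable names are assumptions, not something taken from the question.

```r
vector_x <- c(NA, 3, NA, 1, NA, 5, 6, 7, 7, 9, 0, 2, 12, 324, 54, 23, 12,
              324, 122, 23, 324, 332, 45, 78, 32, 12, 342, 95, 67, NA)

is_value <- as.integer(!is.na(vector_x))   # 1 = value, 0 = NA

# Assign positions 1-10, 11-20, 21-30 to blocks 1, 2, 3 (block size is an assumption)
block <- ceiling(seq_along(vector_x) / 10)

# Count the NAs (the zeros) that fall in each block
na_per_block <- tapply(is_value == 0, block, sum)
```

If the NAs were spread evenly, the counts per block would be roughly equal. Strongly unequal counts point to clustering.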
Upvotes: 1