Reputation: 41
developers!
I have encountered an error message
Error in if (obs <= ei) 2 * pv else 2 * (1 - pv) : missing value where TRUE/FALSE needed
stopping me to get the value from Moran's I function from ape package. Here is what I did:
library(ape)
nrstp <- data.frame(
X = c(300226.9, 300224.6, 300226.4, 300226.1, 300224.0, 300226.4, 300225.7, 300226.4, 300226.1, 300226.4, 300226.3, 300226.3, 300227.1),
Y = c(5057949, 5057952, 5057950, 5057950, 5057956, 5057950, 5057950, 5057950, 5057950, 5057950, 5057950, 5057950, 5057949),
V3 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0))
nrstp = data.frame(nrstp)
dist = as.matrix(dist(cbind(nrstp$X, nrstp$Y)))
invdist = 1/dist
invdist[is.infinite(invdist)] <- 0
moranI = Moran.I(nrstp$V3, invdist)
The intention of this code is to calculate Moran's I from a series of point to check spatial autocorrelation. So far, this seems to be the only function working for Moran's I in R. After a few testing (I have thousands groups of points), this error seems only happen to the input vector having only one value (I tried other numbers than 0, it still raise this error).
Could someone help me improve this code? Or are their better suggestion to calculate Moran's I or test spatial autocorrelation from linestring (those point groups are origin point of one linestring and the closest points from other linestring within 10 meter buffer of such origin point)?
Thank you ahead for any help!
Upvotes: 4
Views: 1835
Reputation: 581
The problem is that your x
are all the same value. If you look in the code from Abdur Rohman the calculation of the function is
m <- mean(x)
y <- x - m
cv <- sum(weight * y %o% y)
v <- sum(y^2)
obs <- (n/s) * (cv/v)
if all x
are the same than the mean of m <- mean(x)
is obviously the same value as all x
and y, v, obs
are 0.
For obs
you divide cv/v
which is NaN
So at least one value of x
should be different
Upvotes: 2
Reputation: 2944
The control flow choice if(condition) do something
requires that the value of condition
is not NA
.
In your case, obs <= ei
results in NA
. That's why the the error message missing value where TRUE/FALSE needed
is generated.
To understand how obs <= ei
results in NA
, you can check the details inside Moran.I
function:
Moran.I
function (x, weight, scaled = FALSE, na.rm = FALSE, alternative = "two.sided")
{
if (dim(weight)[1] != dim(weight)[2])
stop("'weight' must be a square matrix")
n <- length(x)
if (dim(weight)[1] != n)
stop("'weight' must have as many rows as observations in 'x'")
ei <- -1/(n - 1)
nas <- is.na(x)
if (any(nas)) {
if (na.rm) {
x <- x[!nas]
n <- length(x)
weight <- weight[!nas, !nas]
}
else {
warning("'x' has missing values: maybe you wanted to set na.rm = TRUE?")
return(list(observed = NA, expected = ei, sd = NA,
p.value = NA))
}
}
ROWSUM <- rowSums(weight)
ROWSUM[ROWSUM == 0] <- 1
weight <- weight/ROWSUM
s <- sum(weight)
m <- mean(x)
y <- x - m
cv <- sum(weight * y %o% y)
v <- sum(y^2)
obs <- (n/s) * (cv/v)
if (scaled) {
i.max <- (n/s) * (sd(rowSums(weight) * y)/sqrt(v/(n -
1)))
obs <- obs/i.max
}
S1 <- 0.5 * sum((weight + t(weight))^2)
S2 <- sum((apply(weight, 1, sum) + apply(weight, 2, sum))^2)
s.sq <- s^2
k <- (sum(y^4)/n)/(v/n)^2
sdi <- sqrt((n * ((n^2 - 3 * n + 3) * S1 - n * S2 + 3 * s.sq) -
k * (n * (n - 1) * S1 - 2 * n * S2 + 6 * s.sq))/((n -
1) * (n - 2) * (n - 3) * s.sq) - 1/((n - 1)^2))
alternative <- match.arg(alternative, c("two.sided",
"less", "greater"))
pv <- pnorm(obs, mean = ei, sd = sdi)
if (alternative == "two.sided")
pv <- if (obs <= ei)
2 * pv
else 2 * (1 - pv)
if (alternative == "greater")
pv <- 1 - pv
list(observed = obs, expected = ei, sd = sdi, p.value = pv)
}
<bytecode: 0x000001cd5e0715d0>
<environment: namespace:ape>
By assigning x = nrstp$V3
and weight = invdist
, you will get mean(x) = 0
. This results in y=0
, cv = 0
, v=0
, and finally obs = NaN
. Consequently,
obs <= ei
[1] NA
To overcome the problem, you need to ensure that each of obs
and ei
is not NA
. In your case, if mean(x)
is not zero, obs <= ei
will not be NA
. However, because I know nothing about this particular topic, I'm not sure whether non-zero mean(x)
is always the right solution.
Upvotes: 2