Jingming

Reputation: 11

The meaning of cluster size in Cox process models in spatstat

In some tree wood, the conduits in cross sections clearly aggregate in clusters. It seems natural that a Cox process model in spatstat (R) could be fitted to the conduit point data, and the results include an estimated "Mean cluster size". I am not sure about the meaning of this index: can I interpret it as the mean number of conduits per cluster over the whole conduit point pattern? The code from a good example in the book follows:

    > fitM <- kppm(redwood ~ 1, "MatClust")
    > fitM
    # ...
    # Scale: 0.08654
    # Mean cluster size: 2.525 points
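For my own data I imagine fitting the same kind of model, roughly like this (just a sketch; "conduits" is a hypothetical ppp object holding my conduit coordinates and the observation window):

    # sketch only: 'conduits' would be a ppp object with my conduit centres
    fitC <- kppm(conduits ~ 1, "Thomas")
    fitC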

In their book, the authors of spatstat explain the mean cluster size as the number of offspring points, which are dispersed around parent points like plant seedlings. In my case, no such process happens: conduits are xylem cells that develop from cambium cells at the outside of the stem's annual ring, and they do not disperse randomly. I would like to estimate the mean cluster size and cluster scale for my conduit distribution data, and the Scale and Mean cluster size seem to be what I want. However, the redwood data differ from mine in nature, so I am not sure what these quantities mean for my data. Furthermore, I am wondering which model suits my context: Neyman-Scott, Matern cluster, Thomas, or others? Any suggestion is appreciated. Jingming

Upvotes: 1

Views: 201

Answers (1)

Ege Rubak

Reputation: 4507

If you fit a parametric point process model such as a Thomas or Matern cluster process, you are assuming the data are generated by a random mechanism that produces a random number of clusters with a random number of points in each cluster. The location of the points around each cluster centre is also random. The parameter kappa controls the expected number of clusters, mu controls the expected number of points in a cluster, and scale controls the spatial extent of a cluster. The type of process (Thomas, Matern or others) determines the distribution of the points within a cluster. My best suggestion is to do simulation experiments to understand these different types of processes and see if they are appropriate for your needs.
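To connect these parameters to the output in your question: the "Mean cluster size" reported by kppm is the fitted mu, and kappa and scale are reported alongside it. A minimal sketch, assuming the fitted object fitM from your question (parameters() has a method for kppm objects and should return the fitted trend, kappa, scale and mu):

library(spatstat)
fitM <- kppm(redwood ~ 1, "MatClust")
# fitted cluster parameters: kappa, scale and mu (the mean cluster size)
parameters(fitM)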

For example, on average 10 clusters in the unit square with on average 5 points in each and a short spatial extent of the clusters (scale = 0.01) gives you fairly well-defined, tight clusters:

library(spatstat)
set.seed(42)
sim1 <- rThomas(kappa = 10, mu = 5, scale = 0.01, nsim = 9)
plot(sim1, main = "")

On the other hand, on average 10 clusters in the unit square with on average 5 points in each and a bigger spatial extent of the clusters (scale = 0.05) gives a less clear picture, where it is hard to see the clusters:

sim2 <- rThomas(kappa = 10, mu = 5, scale = 0.05, nsim = 9)
plot(sim2, main = "")
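Once you have fitted a model to your own data, you can also simulate from the fitted model and compare the simulations, or a summary-function envelope based on them, with the observed pattern. A sketch, again assuming the fitted object fitM from above; both simulate() and envelope() have methods for kppm objects:

# realisations of the fitted Matern cluster model
sims <- simulate(fitM, nsim = 9)
plot(sims, main = "")

# pointwise envelope of the L-function under the fitted model
E <- envelope(fitM, Lest, nsim = 39)
plot(E)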

In conclusion: experiment with simulation, and remember to do many simulations of each experiment rather than just one, which can be very misleading.

Upvotes: 1
