Training and simulating a spatstat ppm using multiple datasets

Question

Disclaimer: I'm very new to spatstat and spatial point modeling in general... please excuse my naivete.

I have recently tried using spatstat to fit and simulate spatial point patterns related to weather phenomena, where the spatial pattern represents a set of eyewitness reports (for example, reports of hail occurrence) and the observation window and covariate are based on some meteorological parameter (e.g. the window is the area where moisture is at least X, and the moisture variable is additionally passed as a covariate when training the model).

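library(spatstat)

# window: pixels where moisture is above the threshold X
# (owin() and im() must be given matching pixel coordinates, e.g. via xrange/yrange)
moistureMask = owin(mask=moisture>X)
# covariate: the moisture field as a pixel image
moistureVar = im(moisture)

# hail reports observed inside the moisture window
obsPPP = ppp(x=obsX,y=obsY,window=moistureMask)
myModel = ppm(obsPPP ~ moistureVar)

### then simulate
mySim = simulate(myModel,nsim=10)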

My questions are the following:

1. Is it possible (or, more importantly, even valid) to take a ppm trained on one day with a specific moisture variable and mask, and apply it to another day with a different moisture value and mask? I had considered using the update function to switch out the window and covariate fields of the trained model, but haven't actually tried it yet. If the answer is yes, it's a little unclear to me how to do this programmatically.
2. Is it possible to do an online update of the ppm with additional data? For example, could I train the model iteratively on data from different days (each with its own window and covariate), similar to how many machine learning models are trained on blocks of training data? Let's say I have 10 years of daily data that I'd like to use to train the model, and another 10 years of moisture variables over which I'd like to simulate point patterns. I considered the update function here as well, but it was unclear whether the new model would be based ONLY on the new data, or on a combination of the original and new data.

Please let me know if I'm going in completely the wrong direction with this. References and resources appreciated.

Adrian Baddeley · Accepted Answer

If you have fitted a model using ppm and you update it by specifying new data and/or new covariates, then the new data replace the old data; the updated model's parameters are determined using only the new data that you gave when you called update.

The syntax for the update command is described in the online help for update.ppm (the method for the generic update for an object of class ppm).
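
A minimal sketch of that behaviour, using made-up data (the patterns day1 and day2 are hypothetical) and assuming, as help(update.ppm) describes, that an unnamed point pattern passed to update is taken as the new data. The refitted model should agree with a model fitted from scratch to the second pattern, with no memory of the first:

```r
library(spatstat)
set.seed(1)

# Two hypothetical days of reports with clearly different intensities
day1 <- rpoispp(50)     # about 50 points in the unit square
day2 <- rpoispp(200)    # about 200 points in the unit square

fit1  <- ppm(day1 ~ 1)        # fitted to day 1 only
fit2  <- update(fit1, day2)   # the new pattern replaces day 1
fresh <- ppm(day2 ~ 1)        # fitted from scratch to day 2

# exp(intercept) is the estimated intensity (points per unit area)
print(exp(coef(fit1)))
print(exp(coef(fit2)))
print(exp(coef(fresh)))
```

If the update really discards the old data, the last two printed intensities should match each other and differ from the first.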

It seems that what you really want to do is to fit a point process model to many replicate datasets, each dataset consisting of a predictor moistureVar and a point pattern obsPPP. In that case, you should use the function mppm which fits a point process model to replicated data.

To do this, first make a list A containing the moisture regions for each day, and another list B containing the hail report location patterns for each day. That is, A[[1]] is the moisture region for day 1, and B[[1]] is the point pattern of hail report locations for day 1, and so on. Then do

   h <- hyperframe(moistureVar=A, obsPPP=B)
   m <- mppm(obsPPP ~ moistureVar, data=h)

This will fit a single point process model to the full set of data.
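
For concreteness, here is a hedged end-to-end sketch of the recipe above on synthetic data. Everything apart from hyperframe and mppm is an assumption made for illustration: the unit-square region, the made-up moisture fields, and the simulated reports. The last step, simulating on a new day by plugging the fitted coefficients into a log-linear intensity, is one possible approach for a Poisson model, not something prescribed in the answer.

```r
library(spatstat)
set.seed(42)

W <- square(1)    # hypothetical common study region
nDays <- 6

# Hypothetical daily moisture fields: a west-east gradient plus a random daily level
A <- solapply(seq_len(nDays), function(i) {
  level <- runif(1)
  as.im(function(x, y) level + x, W = W)
})

# Hypothetical hail reports, simulated so that intensity rises with moisture
B <- solapply(A, function(Z) rpoispp(eval.im(exp(3 + Z))))

# One row per day: the covariate image and the point pattern
h <- hyperframe(moistureVar = A, obsPPP = B)
m <- mppm(obsPPP ~ moistureVar, data = h)
print(coef(m))

# A new day with its own (hypothetical) moisture field
Znew <- as.im(function(x, y) 0.5 + x, W = W)

# For a Poisson model the fitted intensity is log-linear in the covariate,
# so it can be rebuilt by hand from the coefficients
b0 <- coef(m)[["(Intercept)"]]
b1 <- coef(m)[["moistureVar"]]
lambdaNew <- eval.im(exp(b0 + b1 * Znew))

# Ten simulated patterns for the new day
sims <- rpoispp(lambdaNew, nsim = 10)
```

If the new day has its own moisture mask, the intensity image could be restricted to it before simulating, for example with lambdaNew[newMask, drop=FALSE], where newMask is a hypothetical owin for that day.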

Finally, can I point out that the model

  obsPPP ~ moistureVar

is very simple, because moistureVar is a binary predictor. The model will simply say that the intensity of hail reports takes one value inside the high-moisture region, and another value outside that region. As an alternative, you could consider using the moisture content (e.g. humidity) as a predictor variable.

See Chapters 9 and 16 of the spatstat book for more detail.
