Reputation: 27
In R is there a way to randomly generate values within a set extent from a given point. For example if I have coordinates and wish to generate 10 samples within an surrounding error field, can this be done? And if so, can the characteristics of the error field be defined, i.e a square or circle surrounding the original point. Any insight much appreciated.
Example:(WGS84 ESPG:4326)
Longitude Latitude ErrLong ErrLat
-91.98876953 1.671900034 0.53 1.08
-91.91790771 1.955003262 0.53 1.08
-91.91873169 1.961261749 0.53 1.08
-91.86060333 1.996331811 0.53 1.08
-91.67115021 1.929548025 0.12 0.12
-90.67552948 1.850875616 0.12 0.12
-90.65361023 1.799352288 0.12 0.12
-92.13287354 0.755102754 0.12 0.12
-92.13739014 0.783674061 0.12 0.12
-88.16407776 -4.953748703 0.12 0.12
-82.51725006 -5.717019081 0.12 0.12
-82.50763702 -5.706347942 0.12 0.12
-82.50556183 -5.696153641 0.12 0.12
-82.50305176 -5.685819626 0.12 0.12
-82.18003845 -5.623015404 0.53 1.08
-82.17269897 -5.61870575 0.53 1.08
-82.16355133 -5.612465382 0.12 0.12
For each long/lat I would like 10 randomly generated points within the stated error long/lat (in degrees) from the original location. The random samples should follow a normal distribution and the error field is circular (when lat/long error is equal) and elliptical if not.
Upvotes: 1
Views: 209
Reputation: 72653
You could draw from a truncated normal using msm::rtnorm
.
First, to make things easier, I'd convert the data into long format.
dat <- cbind(id=1:nrow(dat), dat) ## add ID column
names(dat)[-1] <- c("value.lon", "value.lat", "err.lon", "err.lat") ## better names
## reshape to long
dat.l <- reshape(dat, varying=2:5, direction="long")
dat.l[c(1:2, 15:20), ]
# id time value err
# 1.lon 1 lon -91.988770 0.53
# 2.lon 2 lon -91.917908 0.53
# 15.lon 15 lon -82.180038 0.53
# 16.lon 16 lon -82.172699 0.53
# 1.lat 1 lat 1.671900 1.08
# 2.lat 2 lat 1.955003 1.08
# 3.lat 3 lat 1.961262 1.08
# 4.lat 4 lat 1.996332 1.08
Now we use msm::rtnorm
taking value
as the mean
and err
as the absolute value of a confidence interval as well as the truncation points. To get the list nicely separated into lon
and lat
we use by
.
R. <- 1e3
set.seed(42)
res <- by(dat.l, dat.l$time, function(s)
sapply(1:nrow(s), function(m, R=R.) {
x <- as.double(unlist(s[m, -(1:2)]))
o <- msm::rtnorm(R, x[1], abs((x[1] - x[2]))/1.96, x[1] - x[2], x[1] + x[2])
}))
The result looks like this (using R. <- 9
) for sake of brevity:
res
# dat.l$time: lat
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
# [1,] 2.059389 2.854458 1.6480049 1.578799 1.857519 1.933703 1.693664 0.6670599 0.7215978
# [2,] 1.817794 2.435360 0.9810172 1.433516 1.820929 1.844537 1.722964 0.7541789 0.7772778
# [3,] 1.363776 1.499776 2.3656603 2.753531 1.951757 1.911148 1.755089 0.6590040 0.8097877
# [4,] 1.298948 2.903252 1.3621228 2.685882 1.902042 1.850533 1.824228 0.6813604 0.7081114
# [5,] 1.976920 2.017745 2.1074160 2.823800 1.950198 1.785133 1.762703 0.7199149 0.8322832
# [6,] 1.664815 1.664443 1.6482465 1.441457 1.899035 1.807138 1.810606 0.7456769 0.8074188
# [7,] 1.736728 1.494439 2.2212244 1.744971 1.987707 1.835817 1.878827 0.7938251 0.8730894
# [8,] 1.518350 1.541916 1.9629348 1.386725 1.985631 1.833966 1.809587 0.7365271 0.7162421
# [9,] 1.761203 1.667451 1.7359951 2.712280 1.849972 1.965899 1.818468 0.8044030 0.7862688
# [,10] [,11] [,12] [,13] [,14] [,15] [,16]
# [1,] -4.909253 -5.611472 -5.673014 -5.688496 -5.668813 -5.117575 -6.365792
# [2,] -5.024007 -5.691572 -5.601893 -5.752438 -5.771032 -5.795218 -5.392146
# [3,] -4.959013 -5.636268 -5.791113 -5.639635 -5.670745 -5.902636 -4.946774
# [4,] -5.031824 -5.609281 -5.650881 -5.730072 -5.680132 -4.940293 -5.801787
# [5,] -4.984777 -5.774233 -5.807611 -5.711324 -5.801857 -4.618648 -5.821920
# [6,] -4.967051 -5.760783 -5.692485 -5.770230 -5.744132 -6.684446 -6.646540
# [7,] -4.929440 -5.648386 -5.798339 -5.728268 -5.669888 -5.140643 -6.525713
# [8,] -5.031480 -5.609127 -5.646710 -5.579407 -5.787876 -4.587991 -4.771850
# [9,] -5.071611 -5.763129 -5.621419 -5.606133 -5.592998 -6.402314 -4.752597
# ----------------------------------------------------------------------
# dat.l$time: lon
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] -92.12306 -92.27813 -91.89380 -91.96530 -91.70359 -90.59310 -90.60037 -92.12645
# [2,] -92.08298 -91.73772 -91.74796 -92.32808 -91.57151 -90.55784 -90.69050 -92.11317
# [3,] -91.94673 -91.83403 -91.66878 -91.60644 -91.66306 -90.75866 -90.66495 -92.11768
# [4,] -92.33240 -91.57389 -92.15855 -92.03448 -91.75625 -90.63687 -90.58756 -92.11370
# [5,] -92.17743 -91.58370 -91.82970 -91.44922 -91.72398 -90.75778 -90.62202 -92.15861
# [6,] -92.39499 -91.41112 -92.36735 -92.12330 -91.78401 -90.68612 -90.56967 -92.05469
# [7,] -92.40120 -92.02109 -91.57844 -92.07230 -91.75370 -90.72048 -90.64158 -92.24910
# [8,] -92.08168 -92.10115 -91.98592 -91.33367 -91.58579 -90.60831 -90.65058 -92.17405
# [9,] -91.90599 -91.41466 -91.49233 -91.62150 -91.61410 -90.60368 -90.75319 -92.01950
# [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16]
# [1,] -92.16208 -88.17055 -82.51806 -82.50556 -82.54585 -82.49562 -81.76493 -81.84638
# [2,] -92.25042 -88.27982 -82.50876 -82.61386 -82.49595 -82.40652 -82.31069 -82.34158
# [3,] -92.20928 -88.08214 -82.55565 -82.43839 -82.48540 -82.55503 -82.38119 -81.84021
# [4,] -92.16342 -88.08550 -82.60778 -82.40032 -82.61227 -82.55625 -82.70171 -82.46027
# [5,] -92.02135 -88.09106 -82.44550 -82.51054 -82.54662 -82.40365 -81.91754 -81.83588
# [6,] -92.02523 -88.22512 -82.58183 -82.43660 -82.51187 -82.47769 -82.56931 -81.86314
# [7,] -92.18523 -88.27581 -82.51715 -82.45542 -82.40686 -82.59609 -81.75961 -82.62096
# [8,] -92.09482 -88.23731 -82.43151 -82.51785 -82.45835 -82.54335 -82.45329 -81.75484
# [9,] -92.07861 -88.18889 -82.60739 -82.46636 -82.48639 -82.41555 -82.11490 -82.59231
Comparison with specified error ranges:
lapply(res, function(x) cbind(mean=colMeans(x), err=apply(x, 2, function(x)
max(abs(range(x - mean(x))))
)))
# $lat
# mean err
# [1,] 1.6641013 1.0633450
# [2,] 1.9512697 1.0791531
# [3,] 1.9664345 1.0766429
# [4,] 1.9827845 1.0752871
# [5,] 1.9284320 0.1210392
# [6,] 1.8525683 0.1213176
# [7,] 1.8010929 0.1214542
# [8,] 0.7511818 0.1237103
# [9,] 0.7871224 0.1228840
# [10,] -4.9542575 0.1203926
# [11,] -5.7174928 0.1200936
# [12,] -5.7064194 0.1198188
# [13,] -5.6925109 0.1234913
# [14,] -5.6876203 0.1217520
# [15,] -5.6436551 1.1001096
# [16,] -5.5955709 1.1015958
#
# $lon
# mean err
# [1,] -91.99891 0.5390560
# [2,] -91.91370 0.5327020
# [3,] -91.92065 0.5312584
# [4,] -91.84195 0.5476753
# [5,] -91.67497 0.1229412
# [6,] -90.67413 0.1212662
# [7,] -90.64743 0.1261391
# [8,] -92.13235 0.1204769
# [9,] -92.13511 0.1214228
# [10,] -88.16036 0.1235441
# [11,] -82.51747 0.1198272
# [12,] -82.50483 0.1225459
# [13,] -82.50418 0.1212391
# [14,] -82.50338 0.1202114
# [15,] -82.16850 0.5410282
# [16,] -82.16828 0.5330564
Looks not too bad.
And the distributions look like so (using R. <- 1e3
):
Longitudes:
Latitudes:
Data:
dat <- read.table(header=TRUE, text='Longitude Latitude ErrLong ErrLat
-91.98876953 1.671900034 0.53 1.08
-91.91790771 1.955003262 0.53 1.08
-91.91873169 1.961261749 0.53 1.08
-91.86060333 1.996331811 0.53 1.08
-91.67115021 1.929548025 0.12 0.12
-90.67552948 1.850875616 0.12 0.12
-90.65361023 1.799352288 0.12 0.12
-92.13287354 0.755102754 0.12 0.12
-92.13739014 0.783674061 0.12 0.12
-88.16407776 -4.953748703 0.12 0.12
-82.51725006 -5.717019081 0.12 0.12
-82.50763702 -5.706347942 0.12 0.12
-82.50556183 -5.696153641 0.12 0.12
-82.50305176 -5.685819626 0.12 0.12
-82.18003845 -5.623015404 0.53 1.08
-82.17269897 -5.61870575 0.53 1.08')
Upvotes: 1