Reputation: 4230
I have two data sets of xy coordinates. The first one has xy coordinates plus a tag column with my factor levels. I called the data.frame
qq
and it looks like this:
structure(list(x = c(5109, 5128, 5137, 5185, 5258, 5324, 5387,
5343, 5331, 5347, 5300, 5180, 4109, 4082, 4091, 4139, 4212, 4279,
4291, 4297, 4285, 4301, 4254, 4181), y = c(1692, 1881, 2070,
2119, 2144, 2065, 1987, 1813, 1705, 1649, 1631, 1654, 1847, 2015,
2204, 2253, 2278, 2282, 2166, 1947, 1839, 1783, 1765, 1783),
tag = c("MPN_right", "MPN_right", "MPN_right", "MPN_right",
"MPN_right", "MPN_right", "MPN_right", "MPN_right", "MPN_right",
"MPN_right", "MPN_right", "MPN_right", "MPN_left", "MPN_left",
"MPN_left", "MPN_left", "MPN_left", "MPN_left", "MPN_left",
"MPN_left", "MPN_left", "MPN_left", "MPN_left", "MPN_left"
)), .Names = c("x", "y", "tag"), row.names = c(NA, -24L), class = "data.frame")
I generated random data for the other one using qq
xy
means with a large sd
.
set.seed(123)
my_points=data.frame(x=rnorm(n =1000,mean=mean(qq$x),sd=1000),
y=rnorm(n=1000,mean=mean(qq$y),sd=1000))
if I use the in.out
function from mgcv
package, I obtain somewhat what I want.
The main issues with this approach is that my 'Polygon' is not closed nor will be interpreted as 2 polygons by factor. The package advises to use one NA row in between but I'd rather use my tag column since I will be trying to use more than 2 levels in my tag factor, namely, more than 2 polygons). My final goal is to produce a table of the number of points within each one.
Upvotes: 3
Views: 1035
Reputation: 6796
lapply()
and sapply()
help you to use function par level.
## a bit edited to make output clear
library(dplyr); library(mgcv)
TAG <- unique(qq$tag)
IN.OUT <- lapply(TAG, function(x) as.matrix(qq[qq$tag==x, 1:2])) %>% # make a matrix par level
sapply(function(x) in.out(x, as.matrix(my_points))) # use in.out() with each matrix
colnames(IN.OUT) <- TAG
head(IN.OUT, n = 3)
# MPN_right MPN_left
# [1,] FALSE FALSE
# [2,] FALSE FALSE
# [3,] FALSE FALSE
apply(IN.OUT, 2, table)
# MPN_right MPN_left
# FALSE 983 990
# TRUE 17 10
Upvotes: 1
Reputation: 5932
what about this:
mysppoint <- SpatialPoints(coords = my_points) # create spatial points
qq$tag <- as.factor(qq$tag)
polys = list()
# create one polygon for each factor level
for (lev in levels(qq$tag)){
first_x <- qq$x[qq$tag == lev][1]
first_y <- qq$y[qq$tag == lev][1]
qq <- rbind(qq, data.frame(x = first_x, y = first_y, tag = lev)) # "close" the polygon by replicating the first row
polys[[lev]] <- Polygons(list(Polygon(matrix(data = cbind(qq$x[qq$tag == lev], # transform to polygon
qq$y[qq$tag == lev]),
ncol = 2))), lev)
}
mypolys <- SpatialPolygons(polys) # convert to spatial polygons
inters <- factor(over(mysppoint, mypolys), labels = names(mypolys)) # intersect points with polygons
table(inters)
, which gives:
inters
MPN_left MPN_right
10 17
advantage of this is that it gives you proper spatial objects to work with. For example:
plotd <- fortify(mypolys )
p <- ggplot()
p <- p + geom_point(data = my_points, aes(x = x , y = y), size = 0.2)
p <- p + geom_polygon(data = plotd, aes(x = long, y = lat, fill = id), alpha = 0.7)
p
Upvotes: 2
Reputation: 4230
I ended up using lapply
and a combination of split
and more lapply
. So here is the code, please ignore the extract_coords
helper function, it basically gives me a dataframe
with the x
,y
and tag columns. I also managed to subset the points from the original your_coords
and count them (returning them as a vector instead of a table).
inside_ROI = function(your_ROI_zip,your_coords){
# Helper function will take list from zip ROIs and merge them into a df
qq=extract_coords(your_ROI_zip)
# We use tag for splitting by the region
lista=split(qq,qq$tag)
# We check if they are in or out
who_is_in = lapply(lista,function(t) in.out(cbind(t$x,t$y),x=cbind(your_coords$x,your_coords$y)))
# We sum to get the by area countings
region_sums = unlist(lapply(who_is_in,function(t) sum(as.numeric(t))))
# obtain indices for subset of TRUE values
aa=lapply(who_is_in,function(p) which(p==T))
whos_coords=list()
for (i in aa){
whos_coords = append(whos_coords,values=list(your_coords[i,]))
# whos_coords[[i]] = your_coords[i,]
}
# Change names
names(whos_coords) = names(aa)
# Put into list for more than one output
out=list(region_sums,aa,whos_coords)
return(out)
}
Upvotes: 1