Reputation: 11
I have x,y coordinates and the "group" (county) in which each point is located. For each county, I want to know the minimum, maximum, and mean distance between the points in that county. I then want to tag each point with its county's min, max, and mean distance. Getting the min, max, and mean distance over all observations is easy, but I can't figure out how to get it by county. Here is what I'm using as a test for min:
county <- as.integer(c(1, 1, 1, 2, 2, 2))
x <- c(1.0, 2.0, 5.0, 10., 20., 50.)
y <- c(1.0, 3.0, 4.0, 10., 30., 40.)
xy <- data.frame(county,x,y)
xy$mindist <- min(dist(cbind(xy$x, xy$y)))
The min, max, mean for County 1 are 2.2, 5, and 3.5. The min, max, mean for County 2 are 22.4, 50, and 34.7. The code above tags every point with the global minimum (2.2) rather than tagging all County 1 points with 2.2 and all County 2 points with 22.4. I've tried modifying it by grouping, using by statements, and using aggregate, without success.
Any thoughts?
Upvotes: 1
Views: 767
Reputation: 6181
You can do grouped calculations easily with the dplyr package. One way is to do the following:
library(dplyr)

# One row per county: min, mean, and max pairwise distance
xy %>%
  group_by(county) %>%
  summarise(mind  = min(dist(cbind(x, y))),
            meand = mean(dist(cbind(x, y))),
            maxd  = max(dist(cbind(x, y))))
which yields
# A tibble: 2 x 4
  county      mind     meand  maxd
   <int>     <dbl>     <dbl> <dbl>
1      1  2.236068  3.466115     5
2      2 22.360680 34.661152    50
You could also build the distance object once per county and derive all three statistics from it, which avoids repeating the cbind and dist calls.
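Since the question asks for every point to be tagged with its county's statistics, here is a minimal sketch of one way to do that while computing the distances only once per county. It assumes dplyr 1.0 or later (where summarise accepts an unnamed data frame to create several columns); dist_stats and the column names mindist, meandist, and maxdist are hypothetical names chosen for illustration, not part of the original answer.

```r
library(dplyr)

# Test data from the question
county <- as.integer(c(1, 1, 1, 2, 2, 2))
x <- c(1.0, 2.0, 5.0, 10, 20, 50)
y <- c(1.0, 3.0, 4.0, 10, 30, 40)
xy <- data.frame(county, x, y)

# Hypothetical helper: build the pairwise distances once and
# return all three statistics as a one-row data frame.
dist_stats <- function(x, y) {
  d <- dist(cbind(x, y))
  data.frame(mindist = min(d), meandist = mean(d), maxdist = max(d))
}

# One row of statistics per county (assumes dplyr >= 1.0)
stats <- xy %>%
  group_by(county) %>%
  summarise(dist_stats(x, y))

# Attach the county-level statistics to every point
tagged <- left_join(xy, stats, by = "county")
```

With the test data, tagged keeps all six points, and each row carries its own county's values, so the County 1 rows get a mindist of about 2.24 and the County 2 rows about 22.36. Note that a county with a single point has no pairwise distances, so min and max would return Inf and -Inf with a warning; that case would need separate handling.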
Upvotes: 2