Reputation: 173
I'm trying to examine the residuals of my model on a map, using ggplot.
My data looks something like this below.
LAT LONG residuals prevSampling
2668 42.92890 -73.96417 -0.9456018 no
2653 43.06538 -77.03785 -0.9178303 yes
2579 42.45123 -78.86276 -0.9032406 no
2654 42.88848 -78.64891 -0.8738269 yes
2652 43.01445 -78.48273 -0.8539124 yes
2510 42.51378 -78.04134 -0.8493541 yes
I'm first trying to plot the points by Lat/Long. I wanted the size of each point to correspond to the magnitude of the residuals and two different colors, for "yes" and "no" in prevSampling (i.e. size of points will vary for "yes" in one color / size of points will vary for "no" in another color).
I first created a base map with this code:
gg1<-ny_base +
theme_nothing() +
geom_polygon(data = ny_county, fill = NA, color = "white") +
geom_polygon(color = "black", fill = NA)
And then tried to make a plot with this code. I split my data (res2017_occur_loc) into two dataframes (res2017_occur_locY & res2017_occur_locN), by whether the prevSampling is "yes" or "no".
gg1 +
geom_point(data = res2017_occur_locY, aes(x = LONG, y = LAT, size=res2017_occur_locY$residuals,color = "black", fill = "yellow",), shape = 21, group=FALSE) +
geom_point(data = res2017_occur_locN, aes(x = LONG, y = LAT,size=res2017_occur_locN$residuals,color="black",fill="red"), shape = 21, group=FALSE) +
theme(legend.position = c(0, 1),legend.justification = c(0, 1))+
scale_color_manual(values = c("yellow","red"))
I've posted the top of the map to show the issues I'm having.
It gives me a map with points of different sizes and colors, but (1) the legend has no text next to the keys and I can't figure out how to label it, and (2) is there a way to get an idea of the magnitude of the residuals in the legend?
Thank you so much in advance!
Upvotes: 1
Views: 1028
Reputation: 2091
You don't need to split things up like that. You can specify what to use for size and what to use for fill. If prevSampling is not already a factor, you can just wrap it in factor within the aes (like fill = factor(prevSampling)); otherwise it doesn't need it. If you want the sizes larger for smaller numbers, i.e. larger for -0.9, just add scale_size(trans = "reverse") to the end.
df <- structure(list(LAT = c(42.9289, 43.06538, 42.45123, 42.88848,
43.01445, 42.51378, 43.31254, 42.4399), LONG = c(-73.96417, -77.03785,
-78.86276, -78.64891, -78.48273, -78.04134, -78.3917, -78.0129
), residuals = c(-0.9456018, -0.9178303, -0.9032406, -0.8738269,
-0.8539124, -0.8493541, -0.3224, -0.2934), prevSampling = c("no",
"yes", "no", "yes", "yes", "yes", "no", "no")), class = "data.frame", row.names = c(NA,
-8L))
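library(ggplot2) # provides ggplot(), map_data(), geom_point(), ...
library(maps) # map_data("state") needs the maps package installed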
usa <- map_data("state")
ny <- subset(usa, region %in% "new york")
p <- ggplot() + geom_polygon(data = ny, aes(x = long, y = lat, group = group), color = "white", fill = "grey10")
p +
geom_point(data = df, aes(x = LONG, y = LAT, size = residuals, fill = prevSampling), shape = 21, group=FALSE) +
theme(legend.position = c(0, 1), legend.justification = c(0, 1)) +
labs(size = "Residuals", fill = "Previous Sampling")
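On the second part of the question (seeing the magnitude of the residuals in the legend), one possible approach is to map size to abs(residuals) and give the size scale explicit breaks. The following is only a sketch under assumptions: the pts data frame reuses four rows of the df above, and the breaks, limits, range and colors are illustrative choices, not values from the original posts. The map layer is left out so the example needs only ggplot2.

```r
library(ggplot2)

# Four rows copied from the df above (rows 1, 2, 7 and 8)
pts <- data.frame(
  LAT = c(42.9289, 43.06538, 43.31254, 42.4399),
  LONG = c(-73.96417, -77.03785, -78.3917, -78.0129),
  residuals = c(-0.9456018, -0.9178303, -0.3224, -0.2934),
  prevSampling = c("no", "yes", "no", "no")
)

p_mag <- ggplot(pts, aes(x = LONG, y = LAT)) +
  # size follows the absolute residual; fill follows prevSampling
  geom_point(aes(size = abs(residuals), fill = prevSampling),
             shape = 21, colour = "black") +
  # named values tie each color to a category explicitly
  scale_fill_manual(values = c(yes = "yellow", no = "red")) +
  # illustrative breaks so the legend shows a few reference magnitudes
  scale_size_continuous(limits = c(0, 1), breaks = c(0.25, 0.5, 0.75),
                        range = c(1, 6)) +
  labs(size = "|Residual|", fill = "Previous Sampling")

p_mag
```

The same geom_point, scale and labs calls could be added to p (or to gg1) instead of a bare ggplot(), passing data = pts to geom_point.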
Upvotes: 1
Reputation: 1123
First, you should not split your data, since ggplot can work with prevSampling as a factor. Just do:
res2017_occur_loc$prevSampling <- as.factor(res2017_occur_loc$prevSampling)
Then change your plot code:
gg1 +
  # refer to columns by bare name inside aes(), not with data$column
  geom_point(data = res2017_occur_loc,
             aes(x = LONG, y = LAT, size = residuals, colour = prevSampling),
             shape = 21, group = FALSE) +
  scale_color_manual(values = c("yellow", "red")) +
  theme(legend.position = c(0, 1), legend.justification = c(0, 1)) +
  labs(colour = 'Prev. Sampling', size = 'Residuals')
This should work. For discrete colors, use a factor variable rather than splitting the data. I hope this helps.
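One detail worth checking with scale_color_manual: unnamed values are matched to the factor levels in order, and as.factor sorts levels alphabetically, so c("yellow", "red") would give "no" yellow and "yes" red. A small base-R sketch of that matching follows; the vector names here (prev, cols_by_position, cols_by_name) are made up for illustration.

```r
# prevSampling values from the six rows shown in the question
prev <- as.factor(c("no", "yes", "no", "yes", "yes", "yes"))

# Levels are sorted alphabetically: "no" comes before "yes"
levels(prev)

# How an unnamed values vector lines up with those levels
cols_by_position <- setNames(c("yellow", "red"), levels(prev))

# A named vector states the intended pairing regardless of level order
cols_by_name <- c(yes = "yellow", no = "red")
```

Passing cols_by_name as values to scale_color_manual would keep "yes" yellow and "no" red, which is the pairing the question describes.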
Upvotes: 0