Reputation: 13
I'm looking to plot a limited set of Ebola data on a map of the USA using ggplot2 and maps.
The parameters are State and Ebola Infected (yes/no).
The states containing the virus are the following:
Texas Yes
Newyork Yes
These states are to be colored red, and the other states in the country green.
I'm not sure how to code this, and any help would be appreciated.
Below is the code that I could build using another thread on Stack Overflow:
library(ggplot2);
library(maps);
library(plyr);
library(mapproj);
# a data set including the state name and whether it is affected or not (yes/no)
# read every column as character
ebolad = read.csv('/usa.csv', colClasses = 'character');
names(ebolad) = c('col', 'region', 'infected');
ebolad$region = tolower(ebolad$region);
us_state_map = map_data('state');
map_data = merge(ebolad, us_state_map, by = 'region');
map_data = arrange(map_data, order);
ggplot(map_data, aes(x = long, y = lat, group = group)) +
geom_polygon(aes(fill=infected)) +
geom_path(colour = 'gray', linetype = 2) +
scale_fill_brewer('States affected with EBOLA Virus in USA', palette = 'PuRd') +
coord_map();
Could someone help me with improving the plot?
Upvotes: 0
Views: 195
Reputation: 13
#Code to map USA states affected with Ebola Virus
#import the following libraries
library(ggplot2);
library(maps);
library(plyr);
#begin of code
# read the csv file containing the ebola data for usa (important: replace the directory path)
# every column is read as character
ebolad = read.csv('usa.csv', colClasses = 'character');
names(ebolad) = c('col', 'region', 'infected');
ebolad$region = tolower(ebolad$region);
# import the usa state data into local dataset
us_state_map = map_data('state');
# merge ebola data set and usa maps data set
map_data = merge(ebolad, us_state_map, by = 'region');
map_data = arrange(map_data, order);
# storing the data of abbreviated state names to display on the final map
states <- data.frame(state.center, state.abb)
# code to plot the map with state names and colors to distinguish the infected states vs uninfected states
ggplot(map_data, aes(x = long, y = lat, group = group)) +
geom_polygon(aes(fill=infected)) +
geom_path(colour = 'gray', linetype = 2) +
xlab("Longitude") + ylab("Latitude") +
geom_text(data = states, aes(x = x, y = y, label = state.abb, group = NULL), size = 4)+
scale_fill_manual('States affected with EBOLA Virus in USA', values=c("green4","red3")) +
coord_map(projection = "globular") +
theme_grey();
#end of code
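One caveat about the labels, offered as a sketch rather than a correction of the code above: the built-in state.center data covers all 50 states and places Alaska and Hawaii just off the West Coast, while map_data('state') only draws the contiguous states plus D.C. Assuming you only want labels for states that are actually drawn, the label data could be filtered first. The name states_contig is made up for this example.

```r
# Label positions for every state, from R's built-in datasets
states <- data.frame(state.center, state.abb)

# Keep only the states that map_data('state') draws
# (assumption: labels for Alaska and Hawaii are not wanted)
states_contig <- states[!states$state.abb %in% c("AK", "HI"), ]

nrow(states_contig)
```

Passing data = states_contig to geom_text() instead of states would then leave out the two labels that have no polygon underneath.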
Upvotes: 0
Reputation: 15927
Try using a manual scale for fill instead. That is, replace scale_fill_brewer(...) by
scale_fill_manual('States affected with EBOLA Virus in USA', values=c("green","red"))
This does not give the most beautiful green and red, though. But you can use hex codes to define arbitrary colours, for instance values=c("#4daf4a","#e41a1c").
Which value of infected is coloured red may depend on the details of your data. If the colours come out the wrong way round (the uninfected states should be green), simply use values=c("red","green") to switch them.
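If you would rather not depend on the order of the values at all, a named vector can be passed to values. The following is a minimal sketch, assuming infected holds the strings "yes" and "no"; the toy data frame and the name fill_colours are invented for illustration and are not your merged map data.

```r
library(ggplot2)

# Toy data standing in for the merged map data (hypothetical values)
toy <- data.frame(region = c("texas", "new york", "ohio"),
                  infected = c("yes", "yes", "no"))

# Naming the colours ties each one to a data value, whatever the level order is
fill_colours <- c(no = "#4daf4a", yes = "#e41a1c")

p <- ggplot(toy, aes(x = region, fill = infected)) +
  geom_bar() +
  scale_fill_manual('States affected with EBOLA Virus in USA', values = fill_colours)
```

The same scale_fill_manual() call should carry over to the map plot unchanged, as long as the names of the vector match the values in infected.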
If your problem is related to the file usa.csv, it is difficult to help without having the file. I produced the data with the following commands:
ebolad<-data.frame(region=state.name,infected="no",stringsAsFactors=FALSE)
ebolad[ebolad$region %in% c("Texas","New York"),"infected"] <- "yes"
Then, using your code with the change mentioned before, I get a decent plot.
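Since merge() silently drops rows whose region has no match, it may be worth checking the state names in your file before plotting. Here is a rough sketch of such a check; the vector typed is a hypothetical stand-in for the names in your CSV, and the lower-cased state.name is used as an assumed approximation of the region names in map_data('state').

```r
# Reference names: the data built in this answer, lower-cased like the map regions
ebolad <- data.frame(region = state.name, infected = "no", stringsAsFactors = FALSE)
ebolad[ebolad$region %in% c("Texas", "New York"), "infected"] <- "yes"
ebolad$region <- tolower(ebolad$region)

# Hypothetical names as they might appear in a hand-made CSV
typed <- tolower(c("Texas", "Newyork"))

# Names that would find no match and therefore vanish in merge()
unmatched <- setdiff(typed, ebolad$region)
unmatched
```

Any name that shows up in unmatched (here the spelling without a space) would need to be corrected in the file before the state can be coloured.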
Upvotes: 0