Abhi

Reputation: 13

Mapping Diseases on USA map

I'm looking to plot a small data set of Ebola cases on a map of the USA using ggplot2 and maps.

The parameters are State and Ebola Infected (yes/no)

States containing the virus are the following:

Texas Yes
New York Yes

These states should be colored red, and all other states in the country green.

I'm not sure how to code this, and any help would be appreciated.

Below is the code I put together from another thread on Stack Overflow:

library(ggplot2);
library(maps);
library(plyr);
library(mapproj);
ebolad = read.csv('/usa.csv');
# a data set with the state name and whether it is affected or not (yes/no)
names(ebolad) = c('col', 'region', 'infected');
ebolad$region  = tolower(ebolad$region);
us_state_map = map_data('state');
map_data = merge(ebolad, us_state_map, by = 'region'); 
map_data = arrange(map_data, order);
ggplot(map_data, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill=infected)) +
  geom_path(colour = 'gray', linetype = 2) +
  scale_fill_brewer('States affected with EBOLA Virus in USA', palette = 'PuRd') +
  coord_map();

Could someone help me improve the plot?

Upvotes: 0

Views: 195

Answers (2)

Abhi

Reputation: 13

    # Code to map the US states affected by the Ebola virus
    # load the following libraries
    library(ggplot2);
    library(maps);
    library(plyr);
    #begin of code
    # read the CSV file containing the Ebola data for the USA (important: replace the file path)
    ebolad = read.csv('usa.csv');
    names(ebolad) = c('col', 'region', 'infected');
    ebolad$region  = tolower(ebolad$region);
    # import the usa state data into local dataset
    us_state_map = map_data('state');
    # merge ebola data set and usa maps data set
    map_data = merge(ebolad, us_state_map, by = 'region'); 
    map_data = arrange(map_data, order);
    # storing the data of abbreviated state names to display on the final map
    states <- data.frame(state.center, state.abb)
    # code to plot the map with state names and colors to distinguish the infected states vs uninfected states
    ggplot(map_data, aes(x = long, y = lat, group = group)) +
    geom_polygon(aes(fill=infected)) +
    geom_path(colour = 'gray', linetype = 2) +
    xlab("Longitude") + ylab("Latitude") +
    geom_text(data = states, aes(x = x, y = y, label = state.abb, group = NULL), size = 4)+
    scale_fill_manual('States affected with EBOLA Virus in USA', values=c("green4","red3")) +
    coord_map(projection = "globular") +
    theme_grey();
    #end of code

Upvotes: 0

Stibu

Reputation: 15927

Try using a manual scale for fill instead. That is, replace scale_fill_brewer(...) with

scale_fill_manual('States affected with EBOLA Virus in USA', values=c("green","red"))

These are not the most beautiful shades of green and red, but you can use hex codes to define arbitrary colours, for instance values=c("#4daf4a","#e41a1c").

Which value of infected is coloured red depends on the order of its levels in your data. If the uninfected states come out red instead of green, simply use values=c("red","green") to switch the colours.
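
One way to avoid depending on the level order is to pass a named vector to scale_fill_manual, which ggplot2 matches against the data values. A minimal sketch, assuming infected holds exactly the strings "yes" and "no"; the name infected_colours and the toy data frame are only illustrative and not part of the original code:

```r
library(ggplot2)

# Assumption: `infected` contains exactly the strings "no" and "yes".
# The names of the vector are matched to the data values, so the
# order of the entries no longer matters.
infected_colours <- c(no = "#4daf4a", yes = "#e41a1c")

# Toy data, only to demonstrate the mapping (not the real map data).
toy <- data.frame(region   = c("texas", "new york", "ohio"),
                  infected = c("yes", "yes", "no"),
                  stringsAsFactors = FALSE)

p <- ggplot(toy, aes(x = region, fill = infected)) +
  geom_bar() +
  scale_fill_manual('States affected with EBOLA Virus in USA',
                    values = infected_colours)
```

The same scale_fill_manual(...) call should drop into the map code in place of the unnamed version.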

If your problem is related to the file usa.csv, it is difficult to help without having the file. I produced the data by the following commands:

ebolad<-data.frame(region=state.name,infected="no",stringsAsFactors=FALSE)
ebolad[ebolad$region %in% c("Texas","New York"),"infected"] <- "yes"

Then, using your code with the change mentioned before, I get a decent plot.
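
If some states are missing from the plot, the names in the data may not match the region names in map_data('state') (for example "newyork" versus "new york"), because merge() drops unmatched rows by default. A hedged sketch of a check, followed by a left join that keeps every map polygon; the names unmatched and plot_data are illustrative, and treating states without data as "no" is an assumption:

```r
library(ggplot2)  # map_data() also needs the maps package installed

# Same sample data as above, with region names lower-cased for the merge
ebolad <- data.frame(region = tolower(state.name), infected = "no",
                     stringsAsFactors = FALSE)
ebolad[ebolad$region %in% c("texas", "new york"), "infected"] <- "yes"

us_state_map <- map_data('state')

# Names in the data that have no polygon in the map
unmatched <- setdiff(ebolad$region, unique(us_state_map$region))

# Left join: keep all map polygons, even those without a data row
plot_data <- merge(us_state_map, ebolad, by = 'region', all.x = TRUE)

# Assumption: regions without data count as not infected
plot_data$infected[is.na(plot_data$infected)] <- "no"

# Restore the drawing order of the polygon vertices
plot_data <- plot_data[order(plot_data$order), ]
```

Inspecting unmatched before plotting shows which names need fixing in the CSV.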

Upvotes: 0
