Reputation: 85
My data (results) looks like this:
1 2 3 4 5 6 7 8 9 10 subsites sites
0.3207679 0.5831471 0.8062113 1.000211 1.17139 1.324008 1.461217 1.585459 1.698675 1.802433 1 1
0.5519985 0.9743214 1.3157794 1.600919 1.84415 2.054966 2.240087 2.40447 2.551864 2.68515 2 1
0.7527316 1.2980157 1.7215702 2.064576 2.350302 2.59345 2.803964 2.988854 3.153205 3.300787 3 1
0.9410568 1.5892921 2.0769323 2.463184 2.779815 3.046059 3.27444 3.473503 3.64928 3.806147 1 2
1.106054 1.834043 2.3672041 2.782478 3.119263 3.400492 3.640566 3.849008 4.032385 4.195388 2 2
1.262294 2.061886 2.6353753 3.0767 3.431931 3.72695 3.977544 4.19394 4.383119 4.550062 3 2
I would like to plot all subsites in each site in graduated colours, so site 1, for example, would have subsite 1 in dark blue, subsite 2 in lighter blue, etc. Site 2 would have dark green for subsite 1, lighter green for subsite 2, etc. I have tried to use reshape and ggplot2, but the graphs don't even take on the form I want and I can't figure out why.
I am trying to get a series of curved lines like this first image, but the output is much different (second graph).
Here's my code:
meltdf <- melt(results,id.vars=c("sites","subsites"), measure.vars=c(1:10), value.name="rawdata",variable.name="Days")
ggplot(meltdf,aes(x=Days,y=rawdata,colour=subsites,group=sites)) + geom_line()
Could someone please tell me how to melt my data so that it generates the graph I need, and how to make graduated colours within each group? Many thanks.
Upvotes: 3
Views: 2429
Reputation: 59425
This seems close.
library(ggplot2)
library(reshape2)
library(RColorBrewer) # for brewer.pal(...)
df <- cbind(id=1:nrow(df),df)   # one id per row, i.e. per (site, subsite) curve
gg <- melt(df, id=c("id","subsites","sites"))
gg$variable <- as.numeric(substr(gg$variable,2,4))   # drop the one-character prefix, e.g. "X1" -> 1
ggplot(gg)+
  geom_line(aes(x=variable,y=value,color=factor(id)),size=1.5)+
  scale_color_manual("Site",values=c(rev(brewer.pal(3,"Blues")),
                                     rev(brewer.pal(3,"Greens"))),
                     breaks=c(1,4), labels=unique(gg$sites))+
  labs(x="",y="")+
  theme_bw()
Here df is your data from the question (2 sites, 3 subsites).
The basic idea is to add an id column to your data.frame, then melt, then group on id using the color aesthetic. Now you have six colors. To make them blues for site 1 and greens for site 2, we use scale_color_manual(...) to create a custom list of color values, using the Blues color palette for the first three and the Greens color palette for the last three. Then we set the legend breaks=c(1,4) so that the legend displays the darkest Blue/Green. The palettes themselves come from www.colorbrewer.org, implemented in R in package RColorBrewer.
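To see what the id column buys you, here is a minimal sketch of the reshape step on made-up data shaped like the question's. The values and the X1..X10 column names are assumptions, not OP's real numbers. In the original attempt, group=sites most likely joined all three subsites of a site into a single line; giving every row its own id, and converting the day labels to numbers, avoids that.

```r
library(reshape2)

# Made-up stand-in for the question's data: 2 sites x 3 subsites, 10 day columns.
# data.frame() names the unnamed matrix columns X1..X10 (assumed to match OP's names).
vals <- outer(1:6, 1:10, function(r, d) log1p(r * d / 5))
df <- data.frame(vals)
df$subsites <- rep(1:3, times = 2)
df$sites    <- rep(1:2, each = 3)

# One id per row, so each (site, subsite) pair becomes its own curve
df <- cbind(id = seq_len(nrow(df)), df)

# Long format: one row per (curve, day)
gg <- melt(df, id.vars = c("id", "subsites", "sites"))

# Turn the labels "X1".."X10" into the numbers 1..10 for a continuous x axis
gg$variable <- as.numeric(substr(as.character(gg$variable), 2, 4))

table(gg$id)   # number of points per curve
```

Each id then maps to exactly one curve, so color=factor(id) draws six separately colored lines.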
Edit [Response to OP's request in the comments.]
With the complete (or more complete) dataset, this question illustrates two key principles:
In essence, OP has response ~ time data for 4 sites, where each site has between 7 and 10 subsites, so 36 time series in total. OP wishes to display these all on one plot, and hopes to distinguish them by having each site associated with a different base color (e.g. blue for site 1, green for site 2, etc.) and having the subsites distinguished by a color ramp in each color from dark to light. So, (site 1, subsite 1)=dark blue, (site 1, subsite 10)=light blue, etc.
To achieve this we need a generalized version of the approach above. Each curve gets its own color (so, 36 colors). We then create a manual color scale using 4 different ramps, each with the appropriate number of colors, in the right order. The code is below, again assuming OP's dataset is stored in a data.frame df.
library(ggplot2)
library(reshape2)
library(RColorBrewer)
library(colorRamps)
df <- read.csv("subset_for_dropbox.csv")
df <- cbind(id=1:nrow(df),df)
sites <- aggregate(subsites~Sites,df,length) # number of subsites for each site
sites$brks <- c(1,1 + cumsum(sites$subsites)[1:(nrow(sites)-1)]) # position of each site's first curve, used as legend breaks
site.palettes <- c("Blues","Greens","Reds","Purples")
# one dark-to-light ramp per site, with as many colors as the site has subsites
colors <- unlist(apply(sites,1,function(x){colorRampPalette(rev(brewer.pal(9,site.palettes[x[1]]))[1:6])(x[2])}))
gg <- melt(df, id=c("id","subsites","Sites"))
gg$variable <- as.numeric(substr(gg$variable,4,6))
# all curves on one plot
ggplot(gg)+
  geom_line(aes(x=variable,y=value,color=factor(id)),size=1.5)+
  scale_color_manual("Site",values=colors,
                     breaks=sites$brks, labels=unique(gg$Sites))+
  labs(x="",y="")+ xlim(0,10) +
  theme_bw()
It should be evident that this is not a good way to visualize the data. A better approach uses facets:
# faceted, color identifies site, color ramp identifies subsite
ggplot(gg)+
  geom_line(aes(x=variable,y=value,color=factor(id)),size=1.5)+
  scale_color_manual("Site",values=colors,
                     breaks=sites$brks, labels=unique(gg$Sites))+
  labs(x="",y="")+ xlim(0,10) +
  theme_bw()+
  facet_wrap(~Sites,nrow=1)
The problem with this plot is that you don't know which subsite goes with which color (is subsite 1 darkest, or subsite 10?). So a less colorful, but better approach uses facets to identify the sites, and the color ramp to identify the subsites:
# faceted, color ramp identifies subsite
ggplot(gg)+
  geom_line(aes(x=variable,y=value,color=factor(subsites)),size=1.5)+
  scale_color_manual("subsite",values=colorRampPalette(rev(brewer.pal(9,"Blues")[4:9]))(max(sites$subsites)))+
  labs(x="",y="")+ xlim(0,10) +
  theme_bw()+
  facet_wrap(~Sites,nrow=1)
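The palette bookkeeping in the generalized version (one ramp per site, plus legend breaks at each site's first curve) can be checked without plotting. The sketch below is only an illustration: the subsite counts (7, 10, 9, 10) are hypothetical, chosen just to add up to 36, and the dark/light hex endpoints are hand-picked stand-ins for brewer.pal(...), so it needs nothing beyond base R.

```r
# Hypothetical subsite counts per site (assumption: they only need to sum to 36)
n.sub <- c(7, 10, 9, 10)

# Position of each site's first (darkest) curve, used as the legend breaks
brks <- c(1, 1 + cumsum(n.sub)[-length(n.sub)])

# Assumed dark -> light endpoints per site (stand-ins for the Brewer palettes)
ends <- list(c("#08306B", "#C6DBEF"),   # blues
             c("#00441B", "#C7E9C0"),   # greens
             c("#67000D", "#FCBBA1"),   # reds
             c("#3F007D", "#DADAEB"))   # purples

# One ramp per site, each with as many colors as that site has subsites.
# use.names = FALSE keeps the vector unnamed, so a manual scale assigns colors by position.
colors <- unlist(Map(function(e, n) colorRampPalette(e)(n), ends, n.sub),
                 use.names = FALSE)

length(colors)   # total number of colors, one per curve
brks             # ids where each site's ramp starts
```

Using Map() over the counts, rather than apply() over the data.frame, sidesteps the matrix coercion that apply() performs, which matters if the site column is not numeric.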
Upvotes: 3