Reputation: 4787
I am trying to create a stacked area plot. I have been able to do so for the most part except that one of the variables has negative values and it is not being displayed properly. Here's what my data looks like:
Time A B C D E F G H I J K
11/1/2014 53438.819 0 1902.19 0 3620.333 12861.2876 0 315.61 0 34739.4 0
12/1/2014 53626.763 0 1908.88 0 3633.066 12906.5207 0 316.72 0 34861.58 0
1/1/2015 50744.951 0 1806.3 0 3437.831 12212.946 0 299.7 0 32988.17 0
2/1/2015 50807.599 0 1808.53 0 3442.075 12228.0237 0 300.07 0 33028.9 0
3/1/2015 50932.895 0 1812.99 0 3450.564 12258.1792 0 300.81 0 33110.35 0
4/1/2015 7264.046 8489.086 258.5685 1465.2215 25465.54 1748.2606 259.5347 42.9015 243.1085 -20251.22 6521.221
5/1/2015 7226.457 8445.158 257.2305 1457.6395 25203.163 1739.214 258.1918 42.6795 241.8505 -20015.83 6487.476
6/1/2015 7245.251 8467.122 257.8995 1461.4305 24940.787 1743.7373 258.8632 42.7905 242.4795 -19739.96 6504.349
7/1/2015 6906.952 8071.77 245.8575 1393.1925 24678.41 1662.3177 246.7763 40.7925 231.1575 -19720.43 6200.644
8/1/2015 7009.4 8191.496 243.1815 1378.0285 24416.033 1693.5511 244.0902 40.3485 228.6415 -19383.71 6340.736
9/1/2015 7019.042 8202.763 243.516 1379.924 24153.657 1695.8806 244.426 40.404 228.956 -19114.42 6349.457
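For reference, here is a minimal sketch of getting a few of these rows into the long format that the plotting code below expects. It copies only three rows and the columns A, J and K from the table, assumes the dates are month/day/year, and uses base R in place of melt(); the names wide, long and value_cols are placeholders, not names from my real script.

```r
# Three rows and three columns copied from the table above (A, J, K only).
# Assumption: the dates are written as month/day/year.
wide <- data.frame(
  Time = as.Date(c("3/1/2015", "4/1/2015", "5/1/2015"), format = "%m/%d/%Y"),
  A = c(50932.895, 7264.046, 7226.457),
  J = c(33110.35, -20251.22, -20015.83),
  K = c(0, 6521.221, 6487.476)
)

# Base-R stand-in for melt(wide, id = "Time"): one row per (Time, variable).
value_cols <- setdiff(names(wide), "Time")
long <- data.frame(
  Time     = rep(wide$Time, times = length(value_cols)),
  variable = factor(rep(value_cols, each = nrow(wide)), levels = value_cols),
  value    = unlist(wide[value_cols], use.names = FALSE)
)

# The negative J values survive the reshape unchanged.
print(long[long$value < 0, ])
```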
I am using the following code (after melting the data frame) to generate the plot:
p <- ggplot(temp, aes( Time, value)) + theme_bw() +ylab('Monthly Revenue') + xlab('') +
scale_x_date(breaks=x_breaks, labels=x_labels)
p <- p + geom_area(aes(colour = variable, fill= variable), position = 'stack',alpha=0.6) +
theme(axis.text.y=element_text(hjust=0, angle=0),
axis.text.x = element_text(hjust=1, angle=45),
panel.grid.minor.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.major.x=element_line(color='grey90',linetype='dashed'),
panel.grid.major.y=element_line(color='grey90',linetype='dashed'),
plot.title=element_text(size=20),
axis.text=element_text(size=15),
legend.text=element_text(size=15),
legend.key=element_blank(),
legend.title=element_blank()) +
scale_y_continuous(label=thousand_formatter) +
ggtitle('Any Title') + ylab("Dollars")
and here's the plot that I get:
We can see that the negative values are not being displayed properly. How can we display them properly so that the negative values are clearly distinguishable and are under the X axis?
Any help in this regard would be much appreciated.
EDIT:
I am now using the following code. It is just a hack: since a stacked area plot shows the cumulative sum, I added the negative values (with the sign flipped) to the variable B so that their effect is offset:
temp <- temp[,c(1,11,2:10,12)]
temp[6:11,3] <- temp[6:11,3] + (-1 * temp[6:11,2])
temp <- melt(temp,id='Time')
p <- ggplot(temp, aes( Time, value)) + theme_bw() +ylab('Monthly Revenue') + xlab('') +
scale_x_date(breaks=x_breaks, labels=x_labels)
p <- p + geom_area(aes(colour = variable, fill= variable), position = 'stack',alpha=0.6) +
theme(axis.text.y=element_text(hjust=0, angle=0),
axis.text.x = element_text(hjust=1, angle=45),
panel.grid.minor.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.grid.major.x=element_line(color='grey90',linetype='dashed'),
panel.grid.major.y=element_line(color='grey90',linetype='dashed'),
plot.title=element_text(size=20),
axis.text=element_text(size=15),
legend.text=element_text(size=15),
legend.key=element_blank(),
legend.title=element_blank()) +
scale_y_continuous(label=thousand_formatter) +
ggtitle('Any Title') + ylab("Dollars")
p
and I got the following result:
We can still see that the colors are off. I don't know why the pink changes to brown as soon as the values turn negative.
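A possible explanation for the colour shift, sketched under the assumption that position = 'stack' simply takes a running sum of the values at each date: once one series is negative, its band runs back down through the bands around it, and with alpha = 0.6 the overlapping fills would blend into a different colour. The sketch uses three of the 4/1/2015 values from the table; bands and overlaps() are illustrative names only.

```r
# Three of the 4/1/2015 values from the table above, in an illustrative order.
vals <- c(A = 7264.046, J = -20251.22, K = 6521.221)

# Assumed stacking rule: each band runs from the previous running sum to the next.
top    <- cumsum(vals)
bottom <- c(0, top[-length(top)])

bands <- data.frame(
  series = names(vals),
  lower  = unname(pmin(bottom, top)),
  upper  = unname(pmax(bottom, top))
)

# TRUE when bands i and j share part of the y-range.
overlaps <- function(i, j) {
  bands$lower[i] < bands$upper[j] && bands$lower[j] < bands$upper[i]
}

print(bands)
print(overlaps(1, 2))  # A against J
print(overlaps(2, 3))  # J against K
```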
Upvotes: 2
Views: 2121
Reputation: 206207
I still don't quite see this as a stacked area plot, but you could create two groups for each variable to partition the negative and positive values. Here's a very rough sketch of how you might do that.
Here I assume your sample data is in a data.frame named temp
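# melt to long format and flag each value as non-negative (TRUE) or negative (FALSE)
tt <- melt(temp, id.vars="Time")
tt$pos <- tt$value >= 0
# make sure every Time/variable/sign combination exists, filling the gaps with 0
gg <- expand.grid(Time=unique(tt$Time), pos=unique(tt$pos), variable=unique(tt$variable))
mm <- merge(tt, gg, all=TRUE)
mm$value[is.na(mm$value)] <- 0
# one ribbon per variable and sign
mm$grp <- interaction(mm$variable, factor(mm$pos, levels=c(TRUE,FALSE)))
# stack by hand: running sums within each sign and date
mm$ymin <- with(mm, ave(value, pos, Time, FUN=function(x) cumsum(c(0,x[-length(x)]))))
mm$ymax <- with(mm, ave(value, pos, Time, FUN=cumsum))
ggplot(mm, aes(x=Time, fill=variable)) + theme_bw() +
  geom_ribbon(aes(group=grp, color=variable, ymin=ymin, ymax=ymax), alpha=.6)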
Basically, we just create a ribbon chart and do the stacking ourselves. This eliminates the overlapping.
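To make the manual stacking concrete, here is a small worked sketch for a single date, using four of the 4/1/2015 values from the question (A, E, J, K). The data frame one_date is a made-up stand-in; the ave()/cumsum() calls are the same ones used for ymin and ymax above, just without the Time grouping because there is only one date.

```r
# Four values from the 4/1/2015 row of the question's table.
one_date <- data.frame(
  variable = c("A", "E", "J", "K"),
  value    = c(7264.046, 25465.54, -20251.22, 6521.221)
)

# Split by sign, then take running sums within each sign group.
one_date$pos  <- one_date$value >= 0
one_date$ymin <- with(one_date, ave(value, pos,
                      FUN = function(x) cumsum(c(0, x[-length(x)]))))
one_date$ymax <- with(one_date, ave(value, pos, FUN = cumsum))

# Non-negative series pile up from 0; the negative series J gets its own
# band that starts at 0 and extends below the axis.
print(one_date)
```

Each non-negative band starts where the previous one ends, and J never shares a y-range with them, which is why the ribbons should no longer overlap.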
Upvotes: 2
Reputation: 83215
Why not use a line plot? It's much better in my opinion, especially with negative values:
library(dplyr)
library(tidyr)
library(ggplot2)
temp <- df %>% gather(type, value, -Time)
ggplot(temp, aes(Time, value, group=type, colour=type)) +
geom_line(size=1) +
theme_bw()
which gives:
As you want to make an area plot, I guess you want to show the total as well. You can add that to the data frame with:
df$all <- rowSums(df[,-1])
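After that you can melt the data again (so that it includes the new all column) and make a line plot with an extra thick line for the total:
temp <- df %>% gather(type, value, -Time)
ggplot(temp[temp$type!="all",], aes(Time, value, group=type, colour=type)) +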
geom_line(size=1) +
geom_line(data=temp[temp$type=="all",], aes(Time, value), colour="black", size=1.5) +
theme_bw()
which gives:
EDIT:
I found a way to hack it into an area plot. Supposing your dataframe before melting is called df, you should change the order of the columns first with:
df <- df[,c(1,11,6,2,3,4,5,7,8,9,10,12)]
then melt it with (supposing you're using tidyr):
temp <- df %>% gather(type, value, -Time)
after that you can create your plot with:
ggplot(temp, aes(Time, value, group=type, colour=type)) +
geom_area(aes(colour=type, fill=type), alpha=0.4) +
theme_bw()
which gives:
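As a side note, the positional index used to reorder the columns above puts J and E directly after Time. A sketch of the same reordering written by column name, which may be easier to read and adapt; the one-row df here is a hypothetical stand-in that only shares the column names of the question's data, and first, by_name and by_index are illustrative names.

```r
# Hypothetical one-row stand-in with the same column names as the question's data.
df <- data.frame(Time = as.Date("2015-04-01"),
                 A = 1, B = 2, C = 3, D = 4, E = 5, F = 6,
                 G = 7, H = 8, I = 9, J = -10, K = 11)

# Columns to move to the front; every other column keeps its original order.
first   <- c("Time", "J", "E")
by_name <- df[, c(first, setdiff(names(df), first))]

# The positional version from the answer, for comparison.
by_index <- df[, c(1, 11, 6, 2, 3, 4, 5, 7, 8, 9, 10, 12)]

print(identical(names(by_name), names(by_index)))
```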
Upvotes: 2