Reputation: 312
I'm trying to replicate the beautiful visualization at Google's Rhythm of Food with my own dataset, which shows how many people my company hired per week. The dataset (named hiresbyweek) looks like this (25 of 81 rows shown; link to full dataset here):
Week Year total.Hires Month WeekNum
2014-05-05 0:00:00 2014 1 May 18
2014-05-12 0:00:00 2014 1 May 19
2014-05-19 0:00:00 2014 1 May 20
2014-05-26 0:00:00 2014 1 May 21
2014-08-04 0:00:00 2014 1 August 31
2014-09-08 0:00:00 2014 1 September 36
2015-02-23 0:00:00 2015 3 February 08
2015-03-23 0:00:00 2015 4 March 12
2015-05-04 0:00:00 2015 1 May 18
2015-06-01 0:00:00 2015 1 June 22
2015-06-08 0:00:00 2015 1 June 23
2015-09-14 0:00:00 2015 3 September 37
2015-09-21 0:00:00 2015 4 September 38
2015-09-28 0:00:00 2015 15 September 39
2015-10-05 0:00:00 2015 20 October 40
2015-10-12 0:00:00 2015 47 October 41
2015-10-19 0:00:00 2015 40 October 42
2015-10-26 0:00:00 2015 39 October 43
2015-11-02 0:00:00 2015 5 November 44
2015-11-09 0:00:00 2015 2 November 45
2015-11-16 0:00:00 2015 7 November 46
2015-11-23 0:00:00 2015 1 November 47
2015-11-30 0:00:00 2015 7 November 48
2015-12-07 0:00:00 2015 3 December 49
2015-12-14 0:00:00 2015 7 December 50
Currently I've made it as far as this:
ggplot(hiresbyweek, aes(x = WeekNum, y = total.Hires, fill = as.factor(Year))) +
  geom_histogram(stat = "identity", aes(x = WeekNum, y = total.Hires, fill = as.factor(Year))) +
  coord_polar() +
  scale_fill_manual(values = c("#ACD9F4", "#005DA6", "#EC008C")) +
  scale_x_discrete(labels = as.factor(hiresbyweek$Month)) +
  scale_y_discrete(expand = c(0.5, 0)) +
  theme(text = element_text(family = "Avenir"),
        axis.ticks = element_blank(),
        panel.grid = element_blank(),
        panel.background = element_blank())
This produces something close:
The essential problem is:
1) The labels are nowhere close to where they should be: note how the largest numbers are in October, but according to the chart they would fall mostly in April or March.
The nice-to-haves:
1) I'd like to group and rotate those labels a la the Rhythm of Food charts, so there would be simpler labels.
2) I'd like to greatly reduce the relative size of the bars. I've tried it as a count (geom_histogram(stat="count") or stat="bin"), but that makes them all equal and removes the importance of scale, which is the key thing here.
3) I'd like to insert some whitespace between the bars. I've tried adding color="white" in both ggplot(hiresbyweek,aes( x=WeekNum, y=total.Hires,colour="white",fill=as.factor(Year))) and geom_histogram(stat="identity", aes( x=WeekNum, y=total.Hires,fill=as.factor(Year), color="white")), and both oddly produced a pink outline...
Help on the first part is most important (I'd feel it was presentable then), but any and all help is welcome. Thank you for your time and thoughts.
Upvotes: 3
Views: 213
Reputation: 3241
I've been waiting for someone else to post a better and less hackish answer, but I hope this will do in the meantime.
# 1. We can control the order of geom_bars based on the levels of the factor of X.
# So we make a new factor variable and ensure that the levels are in the order of
# < January1, January2, ..., February2, ..., December3, December4 >
hiresbyweek <- hiresbyweek[order(hiresbyweek$WeekNum),]
hiresbyweek$X <- factor(paste0(hiresbyweek$WeekNum, hiresbyweek$Month),
levels = unique(paste0(hiresbyweek$WeekNum, hiresbyweek$Month)))
# 2. But we don't want the axis labels to be: "Jan1, Jan2, Jan3, ..."
# Instead we'll extract only the month out of the X variable (though notice the weekNum
# variable was important so we could get the right order and distinct factor levels)
# But we also don't want repeated axis labels: "Jan", "Jan", "Jan", "Feb", "Feb", ....
# So try to place the unique axis label close to the middle, and leave the rest blank
# (ie. "", "Jan", "", "", "Feb")
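makeLabels <- function(x) {
  x <- gsub("[0-9]", "", x)                   # strip the week number, keep the month
  labs <- c()
  for (a in unique(x)) {
    b <- rep("", length(x[x == a]))           # one blank label per week of this month
    b[ceiling(length(x[x == a]) / 2)] <- a    # put the month name near the middle
    labs <- append(labs, b)
  }
  return(labs)
}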
# 3. Angle the axis labels to imitate Google's Rhythm of Food
# One angle per axis label (i.e. per factor level), not per data row
n <- nlevels(hiresbyweek$X)
ang <- -360 / n * seq_len(n)
ang[ang <= -90 & ang >= -300] <- ang[ang <= -90 & ang >= -300] - 180
library(ggplot2)
ggplot(hiresbyweek, aes(x = X, y = total.Hires, fill = as.factor(Year))) +
  geom_histogram(stat = "identity", width = 0.5) +  # Use width arg for more space between bars
  coord_polar() +
  scale_x_discrete(labels = makeLabels) +           # Apply makeLabels function to X
  scale_y_discrete(expand = c(0.5, 0)) +
  scale_fill_manual(values = c("#ACD9F4", "#005DA6", "#EC008C")) +
  theme(axis.ticks = element_blank(),
        panel.grid = element_blank(),
        panel.background = element_blank(),
        text = element_text(family = "Avenir"),
        title = element_blank(),                    # Remove all titles
        axis.text.x = element_text(angle = ang))    # Apply angles to x-axis labels
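If you want to sanity-check the labelling step without drawing anything, here is a minimal base-R sketch of the same "one label near the middle of each month, blanks elsewhere" idea. The name make_month_labels and the toy input are my own assumptions for illustration, not part of the code above. It uses rle(), so each run of consecutive identical months gets its own label, which differs from makeLabels only when a month name shows up in non-adjacent positions.

```r
# Hypothetical helper (illustration only): blank out every label except
# the one nearest the middle of each run of identical month names.
make_month_labels <- function(x) {
  months <- gsub("[0-9]", "", x)            # "18May" -> "May"
  runs   <- rle(months)                     # consecutive runs of the same month
  ends   <- cumsum(runs$lengths)            # last index of each run
  starts <- ends - runs$lengths + 1         # first index of each run
  mid    <- starts + ceiling(runs$lengths / 2) - 1
  labs   <- rep("", length(months))
  labs[mid] <- runs$values                  # month name at the middle position
  labs
}

# Toy input in the same "<WeekNum><Month>" form as the levels of X
x <- c("01January", "02January", "03January", "05February", "06February")
make_month_labels(x)
# returns c("", "January", "", "February", "")
```

Since scale_x_discrete(labels = ...) accepts a function of the breaks, this could be passed in the same way as makeLabels, assuming the factor levels keep the "<WeekNum><Month>" format.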
Upvotes: 3