ike

Reputation: 312

Using ggplot2 to replicate the Rhythm of Food visualization

I'm trying to replicate the beautiful visualization at Google's Rhythm of Food with my own data set, which shows how many people my company hired per week. The dataset (named hiresbyweek) looks like this (25 of 81 rows shown; link to full dataset here):

                Week Year total.Hires     Month WeekNum
  2014-05-05 0:00:00 2014           1       May      18
  2014-05-12 0:00:00 2014           1       May      19
  2014-05-19 0:00:00 2014           1       May      20
  2014-05-26 0:00:00 2014           1       May      21
  2014-08-04 0:00:00 2014           1    August      31
  2014-09-08 0:00:00 2014           1 September      36
  2015-02-23 0:00:00 2015           3  February      08
  2015-03-23 0:00:00 2015           4     March      12
  2015-05-04 0:00:00 2015           1       May      18
  2015-06-01 0:00:00 2015           1      June      22
  2015-06-08 0:00:00 2015           1      June      23
  2015-09-14 0:00:00 2015           3 September      37
  2015-09-21 0:00:00 2015           4 September      38
  2015-09-28 0:00:00 2015          15 September      39
  2015-10-05 0:00:00 2015          20   October      40
  2015-10-12 0:00:00 2015          47   October      41
  2015-10-19 0:00:00 2015          40   October      42
  2015-10-26 0:00:00 2015          39   October      43
  2015-11-02 0:00:00 2015           5  November      44
  2015-11-09 0:00:00 2015           2  November      45
  2015-11-16 0:00:00 2015           7  November      46
  2015-11-23 0:00:00 2015           1  November      47
  2015-11-30 0:00:00 2015           7  November      48
  2015-12-07 0:00:00 2015           3  December      49
  2015-12-14 0:00:00 2015           7  December      50

Currently I've made it as far as this:

ggplot(hiresbyweek, aes(x = WeekNum, y = total.Hires, fill = as.factor(Year))) +
  geom_histogram(stat = "identity", aes(x = WeekNum, y = total.Hires, fill = as.factor(Year))) +
  coord_polar() +
  scale_fill_manual(values = c("#ACD9F4", "#005DA6", "#EC008C")) +
  scale_x_discrete(labels = as.factor(hiresbyweek$Month)) +
  scale_y_discrete(expand = c(0.5, 0)) +
  theme(text = element_text(family = "Avenir"),
        axis.ticks = element_blank(),
        panel.grid = element_blank(),
        panel.background = element_blank())

This produces something close:

(image: the plot produced by the code above)

The essential problem is:

1) Those labels are nowhere close to where they should be: the largest numbers are in October, but according to the chart they would mostly fall in March or April.

The nice-to-haves:

1) I'd like to group and rotate those labels à la the Rhythm of Food charts, so there would be simpler labels.

2) I'd like to greatly reduce the relative size of said bars. I've tried plotting counts instead (geom_histogram(stat="count") or stat="bin"), but that makes the bars all equal and removes the importance of scale, which is the key thing here.

3) I'd like to insert some whitespace between the bars. I've tried adding color="white" in both ggplot(hiresbyweek,aes( x=WeekNum, y=total.Hires,colour="white",fill=as.factor(Year))) and geom_histogram(stat="identity", aes( x=WeekNum, y=total.Hires,fill=as.factor(Year), color="white")), and both oddly gave a pink outline...

Help on the first part is most important (I'd feel it was presentable then), but any and all help is welcome. Thank you for your time and thoughts.

Upvotes: 3

Views: 213

Answers (1)

Chrisss

Reputation: 3241

I've been waiting for someone else to post a better and less hackish answer, but I hope this will do in the meantime.

# 1. We can control the order of the bars through the levels of the factor on the x-axis.
# So we make a new factor variable and ensure that its levels are in week order, e.g.
# < ..., 08February, ..., 12March, ..., 18May, ..., 50December, ... >
hiresbyweek <- hiresbyweek[order(hiresbyweek$WeekNum),]
hiresbyweek$X <- factor(paste0(hiresbyweek$WeekNum, hiresbyweek$Month), 
                    levels = unique(paste0(hiresbyweek$WeekNum, hiresbyweek$Month)))

# 2. But we don't want the axis labels to be "18May", "19May", "20May", ...
# Instead we'll extract only the month out of the X variable (though notice the WeekNum
# variable was important so we could get the right order and distinct factor levels).
# We also don't want repeated axis labels: "May", "May", "May", "June", "June", ...
# So try to place each unique axis label close to the middle of its group, and leave the rest blank
# (i.e. "", "May", "", "", "June", "")
makeLabels <- function(x) {
  x <- gsub("[0-9]", "", x)
  labs <- c();
  for (a in unique(x)) {
    b <- rep("", length(x[x == a]))
    b[ ceiling(length(x[x==a])/2) ] <- a
    labs <- append(labs, b)
  }
  return(labs)
}

# 3. Angle the axis labels to imitate Google's Rhythm of Food
ang <- -360 / nlevels(hiresbyweek$X) * seq_len(nlevels(hiresbyweek$X)) # one angle per axis label (factor level)
ang[ang <= -90 & ang >= -300] <- ang[ang <= -90 & ang >= -300] -180

library(ggplot2)

ggplot(hiresbyweek, aes(x = X, y = total.Hires, fill = as.factor(Year))) +
  geom_bar(stat = "identity", width = 0.5) + # geom_bar is the usual geom for stat = "identity"; width < 1 leaves space between bars
  coord_polar() + 
  scale_x_discrete(labels = makeLabels) + # Apply the makeLabels function to the levels of X
  scale_y_discrete(expand=c(0.5,0)) + 
  scale_fill_manual(values=c("#ACD9F4","#005DA6","#EC008C")) + 
  theme(axis.ticks = element_blank(), 
    panel.grid = element_blank(), 
    panel.background = element_blank(),
    text = element_text(family="Avenir"),
    title = element_blank(), # Remove all titles
    axis.text.x = element_text(angle= ang)) # Apply angles to x-axis labels
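
To see what steps 2 and 3 actually produce, here is a small sketch that runs on its own in base R. The vector lv is a made-up set of six factor levels in the WeekNum-then-Month form from step 1 (it is not the real data), and makeLabels is repeated with the same logic so the snippet is self-contained.

```r
# Same logic as makeLabels above, repeated so this snippet runs on its own
makeLabels <- function(x) {
  x <- gsub("[0-9]", "", x)   # "18May" -> "May"
  labs <- c()
  for (a in unique(x)) {
    n <- sum(x == a)          # number of weeks belonging to this month
    b <- rep("", n)
    b[ceiling(n / 2)] <- a    # label only the middle week of the group
    labs <- append(labs, b)
  }
  labs
}

# Hypothetical factor levels (assumption: not taken from the full dataset)
lv <- c("18May", "19May", "20May", "21May", "22June", "23June")

demo_labels <- makeLabels(lv)
print(demo_labels)
# "" "May" "" "" "June" ""

# One angle per label; labels past a quarter turn are flipped by 180 degrees
demo_ang <- -360 / length(lv) * seq_along(lv)
flip <- demo_ang <= -90 & demo_ang >= -300
demo_ang[flip] <- demo_ang[flip] - 180
print(demo_ang)
```

Note that makeLabels groups by month name, so it assumes all levels of one month sit next to each other once sorted by week number.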

Result: result
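
On the pink outline from nice-to-have 3: anything inside aes() is treated as data, so colour = "white" there becomes a one-level variable that ggplot2 maps through its default colour scale, and the first default colour is a salmon pink. Setting the colour outside aes() should give a literal white outline. A minimal sketch, using a toy data frame built from three rows of the question's table rather than the full hiresbyweek:

```r
library(ggplot2)

# Toy data (three rows copied from the table in the question)
toy <- data.frame(WeekNum = c("40", "41", "42"),
                  total.Hires = c(20, 47, 40),
                  Year = c(2015, 2015, 2015))

# colour is set as a fixed parameter, outside aes(), so it is used literally
p <- ggplot(toy, aes(x = WeekNum, y = total.Hires, fill = as.factor(Year))) +
  geom_bar(stat = "identity", colour = "white", width = 0.5) +
  coord_polar()
print(p)
```

Together with width, this may be enough to get visible whitespace between the bars.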

Upvotes: 3
