Marijn

Reputation: 10557

Refactoring recurring ggplot code

I'm using R and ggplot2 to analyze some statistics from basketball games. I'm new to R and ggplot, and I like the results I'm getting, given my limited experience. But as I go along, I find that my code gets repetitive, which I dislike.

I created several plots similar to this one:

Net Rating by Effective Field Goal percentage

Code:

efgPlot <- ggplot(gmStats, aes(EFGpct, Nrtg)) + 
  stat_smooth(method = "lm") + 
  geom_point(aes(colour=plg_ShortName, shape=plg_ShortName))  + 
  scale_shape_manual(values=as.numeric(gmStats$plg_ShortName))

The only difference between the plots is the x variable; the next plot would be:

orPlot <- ggplot(gmStats, aes(ORpct, Nrtg)) + 
  stat_smooth(method = "lm") + ...  # from here all is the same

How could I refactor this, such that I could do something like:

efgPlot <- getPlot(gmStats, EFGpct, Nrtg)
orPlot  <- getPlot(gmStats, ORpct, Nrtg)

Update

I think my way of refactoring this isn't really "R-ish" (or ggplot-ish if you will); based on baptiste's comment below, I solved this without refactoring anything into a function; see my answer below.

Upvotes: 2

Views: 347

Answers (2)

Marijn

Reputation: 10557

Although Joran's answer helped me a lot (and he accurately answers my question), I eventually solved this following baptiste's suggestion:

library(ggplot2)
library(reshape2)  # assumption: melt() comes from reshape2 (or the older reshape)

# get the variables I need from the stats data frame:
forPlot <- gmStats[c("wed_ID","Nrtg","EFGpct","ORpct","TOpct","FTTpct",
                     "plg_ShortName","Home")] 
# melt to long format, keeping the id columns and Nrtg as they are:
forPlot.m <- melt(forPlot, id.vars=c("wed_ID", "plg_ShortName", "Home","Nrtg"))
# use facet_wrap to create 4 plots, one per melted variable:
p <- ggplot(forPlot.m, aes(value, Nrtg)) +
  geom_point(aes(shape=plg_ShortName, colour=plg_ShortName)) + 
  scale_shape_manual(values=as.numeric(forPlot.m$plg_ShortName)) +
  stat_smooth(method="lm") +
  facet_wrap(~variable,scales="free")

Which gives me:

Net rating as function of four performance indicators
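
For anyone who wants to try the reshape-then-facet idea without the original data, here is a minimal sketch. It assumes tidyr's pivot_longer as a stand-in for melt, and toyStats is a made-up data frame with invented values and team names, not the real gmStats. The manual shape scale is left out, so the default shape scale is used.

```r
library(ggplot2)
library(tidyr)

# Hypothetical stand-in for gmStats: column names follow the question,
# the values and team labels are invented for illustration.
set.seed(1)
toyStats <- data.frame(
  wed_ID = 1:40,
  Nrtg   = rnorm(40, 0, 10),
  EFGpct = runif(40, 0.40, 0.60),
  ORpct  = runif(40, 0.20, 0.40),
  TOpct  = runif(40, 0.10, 0.20),
  FTTpct = runif(40, 0.15, 0.35),
  plg_ShortName = factor(rep(c("AAA", "BBB", "CCC", "DDD"), each = 10))
)

# Long format: one row per game and indicator; the indicator name goes
# into `variable` and its value into `value` (same column names melt uses).
toyLong <- pivot_longer(
  toyStats,
  cols      = c(EFGpct, ORpct, TOpct, FTTpct),
  names_to  = "variable",
  values_to = "value"
)

# One panel per indicator, each with its own axis ranges.
p <- ggplot(toyLong, aes(value, Nrtg)) +
  geom_point(aes(shape = plg_ShortName, colour = plg_ShortName)) +
  stat_smooth(method = "lm", formula = y ~ x) +
  facet_wrap(~variable, scales = "free")
```

Printing p draws the four panels. Adding another indicator only means adding its column to cols.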

Upvotes: 1

joran

Reputation: 173577

The key to this sort of thing is using aes_string rather than aes (untested, of course):

getPlot <- function(data,xvar,yvar){
    p <- ggplot(data, aes_string(x = xvar, y = yvar)) + 
            stat_smooth(method = "lm") + 
            geom_point(aes(colour=plg_ShortName, shape=plg_ShortName))  + 
            scale_shape_manual(values=as.numeric(data$plg_ShortName))
    print(p)
    invisible(p)
}

aes_string allows you to pass variable names as strings, rather than expressions, which is more convenient when writing functions. Of course, you may not want to hard-code the colour and shape mappings, in which case you could use aes_string again for those.
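
Note that aes_string has been soft-deprecated since ggplot2 3.0.0. A minimal sketch of the same function for current versions, assuming the .data pronoun inside aes, could look like this. The groupvar argument and the toyStats data frame are hypothetical additions for illustration, and the manual shape scale is dropped in favour of the default one (which handles up to six groups).

```r
library(ggplot2)

# Hypothetical stand-in for gmStats; the values and team labels are invented.
set.seed(1)
toyStats <- data.frame(
  EFGpct = runif(40, 0.40, 0.60),
  ORpct  = runif(40, 0.20, 0.40),
  Nrtg   = rnorm(40, 0, 10),
  plg_ShortName = factor(rep(c("AAA", "BBB", "CCC", "DDD"), each = 10))
)

# xvar, yvar and groupvar are column names given as strings;
# .data[[name]] looks each one up in the plot data.
getPlot <- function(data, xvar, yvar, groupvar = "plg_ShortName") {
  ggplot(data, aes(x = .data[[xvar]], y = .data[[yvar]])) +
    stat_smooth(method = "lm", formula = y ~ x) +
    geom_point(aes(colour = .data[[groupvar]], shape = .data[[groupvar]])) +
    # Set labels explicitly so axes and legend show the column names.
    labs(x = xvar, y = yvar, colour = groupvar, shape = groupvar)
}

efgPlot <- getPlot(toyStats, "EFGpct", "Nrtg")
orPlot  <- getPlot(toyStats, "ORpct", "Nrtg")
```

Unlike the version above, this sketch returns the plot without printing it, so the caller decides when to draw it, e.g. print(efgPlot).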

Upvotes: 6
