Jack

Reputation: 857

What is Julia's equivalent ggplot code of R's?

I would like to plot a sophisticated graph in Julia. The code below is my Julia version, which calls R's ggplot2 through RCall.

using CairoMakie, DataFrames, Effects, GLM, StatsModels, StableRNGs, RCall
@rlibrary ggplot2

rng = StableRNG(42)
growthdata = DataFrame(; age=[13:20; 13:20],
                       sex=repeat(["male", "female"], inner=8),
                       weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(rng, 16))

mod_uncentered = lm(@formula(weight ~ 1 + sex * age), growthdata)

refgrid = copy(growthdata)
filter!(refgrid) do row
    return mod(row.age, 2) == (row.sex == "male")
end
effects!(refgrid, mod_uncentered)

refgrid[!, :lower] = @. refgrid.weight - 1.96 * refgrid.err
refgrid[!, :upper] = @. refgrid.weight + 1.96 * refgrid.err

df= refgrid

ggplot(df, aes(x=:age, y=:weight, group = :sex, shape= :sex, linetype=:sex)) + 
  geom_point(position=position_dodge(width=0.15)) +
  geom_ribbon(aes(ymin=:lower, ymax=:upper), fill="gray", alpha=0.5)+
  geom_line(position=position_dodge(width=0.15)) + 
  ylab("Weight")+ xlab("Age")+
  theme_classic()


However, I would like to modify this graph further: change the scale of the y axis and the colors of the ribbon, add error bars, change the text size of the legend, and so on. Since I am new to Julia, I have not succeeded in finding the equivalent code for these modifications. Could someone help me translate the R ggplot code below into Julia?

# label only the male row with the largest weight (requires dplyr)
t1 = filter(df, sex == "male") %>% slice_max(weight, n = 1)


ggplot(df, aes(age, weight, group = sex, shape= sex, linetype=sex,fill=sex, colour=sex)) + 
  geom_line(position=position_dodge(width=0.15)) +
  geom_point(position=position_dodge(width=0.15)) +
  geom_errorbar(aes(ymin = lower, ymax = upper),width = 0.1,
                linetype = "solid",position=position_dodge(width=0.15))+
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = sex, colour = sex), alpha = 0.2) +
  geom_text(data = t1, aes(age, weight, label = round(weight, 1)), hjust = -0.25, size=7, show.legend = FALSE) +
  scale_y_continuous(limits = c(70, 150), breaks = seq(80, 140, by = 20))+
  theme_classic()+
  scale_colour_manual(values = c("orange", "blue")) +
  guides(color = guide_legend(override.aes = list(linetype = c('dotted', 'dashed'))),
         linetype = "none")+
  xlab("Age")+ ylab("Average marginal effects") + ggtitle("Title") +
  theme( 
    axis.title.y = element_text(color="Black", size=28, face="bold", hjust = 0.9),
    axis.text.y = element_text(face="bold", color="black", size=16),
    plot.title = element_text(hjust = 0.5, color="Black", size=28, face="bold"),
    legend.title = element_text(color = "Black", size = 13),
    legend.text = element_text(color = "Black", size = 16),
    legend.position="bottom",
    axis.text.x = element_text(face="bold", color="black", size=11),
    strip.text = element_text(face= "bold", size=15)
  ) 


Upvotes: 8

Views: 2238

Answers (2)

call me Steve

Reputation: 1727

I used Vega-Lite (https://github.com/queryverse/VegaLite.jl), which is also grounded in the "Grammar of Graphics", and LinearRegression (https://github.com/ericqu/LinearRegression.jl), which provides features similar to GLM's. I think comparable results are possible with other plotting and linear regression packages; nevertheless, I hope this gives you a starting point.

using LinearRegression: Distributions, DataFrames, CategoricalArrays
using DataFrames, StatsModels, LinearRegression
using VegaLite

growthdata = DataFrame(; age=[13:20; 13:20],
                       sex=categorical(repeat(["male", "female"], inner=8), compress=true),
                       weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(16))

lm = regress(@formula(weight ~ 1 + sex * age), growthdata)

results = predict_in_sample(lm, growthdata, req_stats="all")

fp = select(results, [:age, :weight, :sex, :uclp, :lclp, :predicted]) |> @vlplot() +
@vlplot(
    mark = :errorband, color = :sex,
    y = { field = :uclp, type = :quantitative, title="Average marginal effects"}, 
    y2 = { field = :lclp, type = :quantitative }, 
    x = {:age, type = :quantitative} ) + 
@vlplot(
    mark = :line, color = :sex,
    x = {:age, type = :quantitative},
    y = {:predicted, type = :quantitative}) +
@vlplot(
    :point, color=:sex ,
    x = {:age, type = :quantitative, axis = {grid = false}, scale = {zero = false}},
    y = {:weight, type = :quantitative, axis = {grid = false}, scale = {zero = false}},
    title = "Title", width = 400 , height = 400
)

This gives a plot with an error band, a fitted line, and the observed points for each sex.

You can change the style of the elements through the "config" property, as described here (https://www.queryverse.org/VegaLite.jl/stable/gettingstarted/tutorial/#Config-1).

Because VegaLite.jl is a wrapper around Vega-Lite, additional documentation can be found on the Vega-Lite website (https://vega.github.io/vega-lite/).
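
To make the styling advice more concrete, here is a minimal sketch of how the requested tweaks (y-axis limits and breaks, manual colors, legend and title text sizes) might look in VegaLite.jl. The data frame `toy` and its numbers are made up for illustration and stand in for `results`; the property names (`domain`, `values`, `range`, `labelFontSize`, `titleFontSize`, `orient`) are Vega-Lite scale, axis, legend and config properties, so check them against the Vega-Lite documentation before relying on them.

```julia
using DataFrames, VegaLite

# Hypothetical stand-in for `results`; the values are illustrative only.
toy = DataFrame(age = [13, 14, 15, 13, 14, 15],
                sex = ["male", "male", "male", "female", "female", "female"],
                predicted = [100.0, 108.0, 116.0, 100.0, 104.0, 108.0])

p = toy |> @vlplot(
    mark = :line,
    x = {:age, type = :quantitative, title = "Age"},
    y = {:predicted, type = :quantitative, title = "Average marginal effects",
         scale = {domain = [70, 150]},             # roughly scale_y_continuous(limits = ...)
         axis = {values = [80, 100, 120, 140]}},   # roughly breaks = seq(80, 140, by = 20)
    color = {:sex, type = :nominal,
             scale = {range = ["orange", "blue"]}}, # roughly scale_colour_manual
    title = "Title", width = 400, height = 400,
    config = {
        axis = {labelFontSize = 16, titleFontSize = 28, grid = false},
        legend = {orient = "bottom", labelFontSize = 16, titleFontSize = 13},
        title = {fontSize = 28, anchor = "middle"}
    }
)
```

Settings placed under `config` apply to every layer, which is closer to how `theme()` behaves in ggplot2, whereas `scale` and `axis` inside an encoding affect only that channel.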

Upvotes: 3

BatWannaBe

Reputation: 4510

As I commented before, you can use R-strings to run R code. To be clear, this isn't like your post's approach, where you piece together many Julia objects that wrap many R objects; here RCall converts a Julia DataFrame to an R data frame and then runs your R code.

Running an R script may not seem very Julian, but code reuse is very Julian. Besides, you're still using an R library and active R session either way, and there might even be a slight performance benefit from reducing how often you make wrapper objects and switch between Julia and R.
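
As a small illustration of the mechanism before the full script, here is a minimal sketch of the round trip, assuming RCall is installed with a working R. The data frame `toy` and the variable names are made up for this example: `$toy` interpolates the Julia object into R, while the `$` in `rdf$weight` stays R's own column-access operator, and `rcopy` converts the R result back to a Julia value.

```julia
using RCall, DataFrames

# Hypothetical toy data, only for illustrating the round trip.
toy = DataFrame(age = [13, 14, 15], weight = [100.0, 108.0, 116.0])

# `$toy` is Julia interpolation: the DataFrame is converted to an R data frame.
R"rdf <- $toy"

# `rdf$weight` is plain R; rcopy brings the results back as Julia values.
n_rows = rcopy(R"nrow(rdf)")
mean_weight = rcopy(R"mean(rdf$weight)")
```

The same pattern scales up to the script below: convert once, then let R do all the plotting inside a single R-string.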

## import libraries for Julia and R; still good to do at top

using CairoMakie, DataFrames, Effects, GLM, StatsModels, StableRNGs, RCall
R"""
library(ggplot2)
library(dplyr)
"""

## your Julia code without the @rlibrary or ggplot lines

rng = StableRNG(42)
growthdata = DataFrame(; age=[13:20; 13:20],
                       sex=repeat(["male", "female"], inner=8),
                       weight=[range(100, 155; length=8); range(100, 125; length=8)] .+ randn(rng, 16))

mod_uncentered = lm(@formula(weight ~ 1 + sex * age), growthdata)

refgrid = copy(growthdata)
filter!(refgrid) do row
    return mod(row.age, 2) == (row.sex == "male")
end
effects!(refgrid, mod_uncentered)

refgrid[!, :lower] = @. refgrid.weight - 1.96 * refgrid.err
refgrid[!, :upper] = @. refgrid.weight + 1.96 * refgrid.err

df= refgrid

## convert Julia's df and run your R code in R-string
## - note that $df is interpolation of Julia's df into R-string,
##   not R's $ operator like in rdf$weight
## - call the R dataframe rdf because df is already an R function

R"""
rdf <- $df
t1 = filter(rdf, sex == "male") %>% slice_max(weight, n = 1)

ggplot(rdf, aes(age, weight, group = sex, shape= sex, linetype=sex,fill=sex, colour=sex)) + 
  geom_line(position=position_dodge(width=0.15)) +
  geom_point(position=position_dodge(width=0.15)) +
  geom_errorbar(aes(ymin = lower, ymax = upper),width = 0.1,
                linetype = "solid",position=position_dodge(width=0.15))+
  geom_ribbon(aes(ymin = lower, ymax = upper, fill = sex, colour = sex), alpha = 0.2) +
  geom_text(data = t1, aes(age, weight, label = round(weight, 1)), hjust = -0.25, size=7, show.legend = FALSE) +
  scale_y_continuous(limits = c(70, 150), breaks = seq(80, 140, by = 20))+
  theme_classic()+
  scale_colour_manual(values = c("orange", "blue")) +
  guides(color = guide_legend(override.aes = list(linetype = c('dotted', 'dashed'))),
         linetype = "none")+
  xlab("Age")+ ylab("Average marginal effects") + ggtitle("Title") +
  theme( 
    axis.title.y = element_text(color="Black", size=28, face="bold", hjust = 0.9),
    axis.text.y = element_text(face="bold", color="black", size=16),
    plot.title = element_text(hjust = 0.5, color="Black", size=28, face="bold"),
    legend.title = element_text(color = "Black", size = 13),
    legend.text = element_text(color = "Black", size = 16),
    legend.position="bottom",
    axis.text.x = element_text(face="bold", color="black", size=11),
    strip.text = element_text(face= "bold", size=15)
  ) 
"""

The result is the same as your post's R code: running it in RCall's R-string produces the same plot.

Upvotes: 5
