Yun Hyunsoo

Reputation: 71

ggplot par new=TRUE option

I am trying to plot 400 ECDF graphs in one image using ggplot. As far as I know, ggplot has no equivalent of base R's par(new=T) option.

The first solution I thought of was to use the grid.arrange function from the gridExtra package. However, I am generating the ECDFs inside a for loop.

Below is my code; you can ignore the data-processing steps.

for (i in 1:400) {
  # Data processing: keep one code, sort by distance, normalize AI_ij
  test <- subset(df, code == temp[i, ])
  test <- test[order(test$Distance), ]
  test$AI_ij <- normalize(test$AI_ij)
  AI <- test$AI_ij

  # ECDF of the normalized values for this code
  ggplot(test, aes(AI)) +
    stat_ecdf(geom = "step") +
    scale_y_continuous(labels = scales::percent) +
    theme_bw() +
    new_theme +
    xlab("Calculated Accessibility Value") +
    ylab("Percent")
}

So I have values stored in "AI" in the for loop.

In this case how should I plot 400 graphs in the same chart?

Upvotes: 0

Views: 714

Answers (1)

Allan Cameron

Reputation: 173928

This is not the way to put multiple lines on a ggplot. To do this, it is far easier to pass all of your data together and map code to the "group" aesthetic to give you one ecdf line for each code.

By far the hardest part of answering this question was attempting to reverse-engineer your data set. The following data set should be close enough in structure and naming to allow the code to be run on your own data.

library(dplyr)
library(BBmisc)
library(ggplot2)

set.seed(1)
all_codes <- apply(expand.grid(1:16, LETTERS), 1, paste0, collapse = "")
temp      <- data.frame(sample(all_codes, 400), stringsAsFactors = FALSE)
df        <- data.frame(code = rep(all_codes, 100),
                        Distance = sqrt(rnorm(41600)^2 + rnorm(41600)^2),
                        AI_ij = rnorm(41600),
                        stringsAsFactors = FALSE)

Since you only want the 400 codes from temp that appear in df to be shown on the plot, you can use dplyr::filter to keep the rows where code %in% temp[[1]], rather than iterating through the codes one element at a time.
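To make the filtering step concrete, here is a minimal base-R sketch on made-up data. The names toy, keep and kept are placeholders I invented for illustration, and keep stands in for temp[[1]]:

```r
# Toy data (made up for illustration): six rows, three distinct codes
toy <- data.frame(code  = c("1A", "2B", "3C", "1A", "2B", "3C"),
                  AI_ij = c(0.2, 1.5, -0.3, 0.9, -1.1, 0.4),
                  stringsAsFactors = FALSE)

# Codes to keep; this plays the role of temp[[1]]
keep <- c("1A", "3C")

# One vectorised %in% test replaces a loop over every element of `keep`
kept <- toy[toy$code %in% keep, ]
kept$code
```

dplyr::filter(toy, code %in% keep) should select the same rows; the base-R indexing is used here only so the sketch needs no packages.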

You can then group_by code, and arrange by Distance within each group before normalizing AI_ij, so there is no need to split your data frame into a new subset for every line: the data is processed all at once and the data frame is kept together.

Finally, you plot this using the group aesthetic. Note that because you have 400 lines on one plot, you need to make each line faint in order to see the overall pattern more clearly. We do this by setting the alpha value to 0.05 inside stat_ecdf.

Note also that there are multiple packages with a function called normalize and I don't know which one you are using. I have guessed you are using BBmisc.
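If you are unsure which normalize is being picked up, one option is to call it with its namespace (for example BBmisc::normalize(AI_ij)) or to write the transformation out yourself. Below is a small base-R sketch of two common choices. The function names rescale01 and standardise are my own, and which of them (if either) matches the normalize you are using is an assumption you would need to check. As far as I know, BBmisc::normalize defaults to method = "standardize".

```r
# Hypothetical helper (not from the question): min-max rescaling to [0, 1]
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) return(rep(0, length(x)))  # constant input: avoid 0/0
  (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical helper: z-score standardisation (centre on the mean, divide by the sd)
standardise <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

x <- c(2, 4, 6, 10)
rescale01(x)     # values between 0 and 1
standardise(x)   # values with mean 0 and sd 1
```

Either helper can be dropped into the mutate() step in place of normalize if you want the transformation to be explicit.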

So you can get rid of the loop and do:

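df %>%
  filter(code %in% temp[[1]]) %>%             # keep only the codes listed in temp
  group_by(code) %>%
  arrange(Distance, .by_group = TRUE) %>%     # sort by Distance within each code
  mutate(AI = normalize(AI_ij)) %>%           # normalize within each code
  ggplot(aes(AI, group = code)) +             # one ECDF line per code
    stat_ecdf(geom = "step", alpha = 0.05) +  # faint lines so the overall pattern shows
    scale_y_continuous(labels = scales::percent) +
    theme_bw() +
    xlab("Calculated Accessibility Value") +
    ylab("Percent")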
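If you want to confirm that the group aesthetic really produces one ECDF per code, a quick check on a small made-up data set is to count the groups in the computed layer data. This is only a sketch: toy, p, built and n_lines are names I made up, and the toy data has three codes rather than 400.

```r
library(ggplot2)

# Toy data (made up for illustration): 3 codes with 50 values each
set.seed(42)
toy <- data.frame(code = rep(c("a", "b", "c"), each = 50),
                  AI   = rnorm(150))

p <- ggplot(toy, aes(AI, group = code)) +
  stat_ecdf(geom = "step", alpha = 0.5)

# layer_data() returns the data computed for a layer;
# each distinct value of `group` corresponds to one ECDF line
built   <- layer_data(p, 1)
n_lines <- length(unique(built$group))
n_lines
```

On your real data the count should equal the number of distinct codes that survive the filter step.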


Upvotes: 1
