Andres Gonzalez

Reputation: 329

Ggplot over multiple dataframes with loops

I'm trying to draw the same graph for multiple dataframes that have the same variables but different values. I have n dataframes called df_1, df_2 ... df_n, and my code goes like this:

#Create dataframes(In this example n = 3)
df_1 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)  
df_2 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)
df_3 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)

##Store dataframes in list
example.list<-lapply(1:3, function(x) eval(parse(text=paste0("df_", x)))) #In order to store all datasets in one list using their name
names(example.list)<-lapply(1:3, function(x) paste0("df_", x))

#Graph and save for each dataframe
for (i in example.list){
  benp <-  ggplot(i, aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    annotate("text", x= mean(b1), y=0, label=round(mean(b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) 
  ggsave(benp, file=paste0(i,"_histogram.png"))
}   

I'm getting the error message "Error in mean(b1): object b1 not found". I don't know how to tell R that b1 comes from dataframe i. Does anybody know what's wrong with my code, or whether there is an easier way to plot over multiple dataframes? Thanks in advance!

Upvotes: 2

Views: 1160

Answers (2)

Rex Parsons

Reputation: 339

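The problem isn't the iteration over the list of dataframes, it's the use of b1 inside annotate(). annotate() does not inherit the plot's data or aesthetics, so b1 is looked up as an ordinary R object and isn't found. Here, I extract each dataframe inside the loop as df_i and refer to the column explicitly as df_i$b1. There is probably a nicer way of doing this, though. Also, ggsave() needs the name of the list item rather than the dataframe itself, so the loop runs over indices and takes the name from names(example.list).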

library(tidyverse)

#Create dataframes(In this example n = 3)
df_1 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)  
df_2 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)
df_3 <- data.frame(a1 = 1:1000,
                   b1 = 1:1000)

##Store dataframes in list
example.list<-lapply(1:3, function(x) eval(parse(text=paste0("df_", x)))) #In order to store all datasets in one list using their name
names(example.list)<-lapply(1:3, function(x) paste0("df_", x))

#Graph and save for each dataframe

for (i in seq_along(example.list)){
  df_i <- example.list[[i]]
  benp <-  
    df_i %>%
    ggplot(aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    annotate("text", x= mean(df_i$b1), y=0, label=round(mean(df_i$b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) 
  ggsave(filename=paste0(names(example.list)[i],"_histogram.png"), plot=benp)
}
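
One possible tidier variant, offered as a sketch rather than the definitive approach: wrap the plot in a function that takes a dataframe and its name, compute the mean once, and iterate over the list and its names together with Map(). The helper name plot_benefits and its out_dir argument are hypothetical, introduced here only for illustration, and after_stat() assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical helper (not from the original post): builds the histogram
# for one dataframe and saves it under the given name.
plot_benefits <- function(df, name, out_dir = tempdir()) {
  # Compute the mean once, outside ggplot, because annotate() does not
  # look up column names in the plot data.
  m <- mean(df$b1)
  p <- ggplot(df, aes(x = b1)) +
    geom_histogram(aes(y = after_stat(density), alpha = after_stat(count)),
                   fill = "steelblue", bins = 60) +
    geom_vline(xintercept = m, color = "red4", linetype = "dashed") +
    annotate("text", x = m, y = 0, label = round(m, digits = 2),
             colour = "red4", size = 3.5, vjust = -1.5, hjust = -0.5) +
    labs(title = "Beneficios", x = "Beneficios ($millones)", y = "Densidad") +
    theme(legend.position = "none")
  out_file <- file.path(out_dir, paste0(name, "_histogram.png"))
  ggsave(filename = out_file, plot = p, width = 7, height = 5)
  invisible(out_file)
}

# Small stand-in data (the original post uses df_1 ... df_n).
example.list <- list(df_1 = data.frame(a1 = 1:1000, b1 = 1:1000),
                     df_2 = data.frame(a1 = 1:1000, b1 = 1:1000))

# Call the helper once per dataframe, passing the matching list name.
saved <- Map(plot_benefits, example.list, names(example.list))
```

Here the files are written to tempdir() so the sketch has no side effects in the working directory; passing out_dir = "." would save them next to the script instead.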

Upvotes: 2

M.Viking

Reputation: 5408

The get() function is what you are looking for: it looks up an object, such as a dataframe, by its name given as a string.

From the help page, get(): "Return the Value of a Named Object"

For example:

x <- "iris"
summary(get(x))
#  Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species  
# Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50  
# 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50  
# Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50  
# Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199                  
# 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800                  
# Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
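
For building a list of several dataframes from their names, base R's mget() is the vectorised counterpart of get(): it takes a character vector of object names and returns a named list, so it may be a simpler replacement for an eval(parse()) call. A minimal sketch, assuming the dataframes are defined in the same environment the code runs in (the small dataframes below are stand-ins):

```r
# Stand-in dataframes for illustration.
df_1 <- data.frame(a1 = 1:3, b1 = 4:6)
df_2 <- data.frame(a1 = 1:3, b1 = 7:9)
df_3 <- data.frame(a1 = 1:3, b1 = 10:12)

# mget() looks up each name and returns a list named after the objects.
example.list <- mget(paste0("df_", 1:3))

names(example.list)
# "df_1" "df_2" "df_3"
```

Because the list comes back already named, no separate names() assignment is needed afterwards.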

Your example:

library(ggplot2)

#Store the names of the dataframes (get() needs the names as strings, not the dataframes themselves)
graph.names <- paste0("df_", 1:3)

#Graph and save for each dataframe
for (i in graph.names){
  df_i <- get(i)  #look up the dataframe whose name is stored in i, e.g. "df_1"
  benp <-  ggplot(df_i, aes(x=b1)) + 
    geom_histogram(fill="steelblue", aes(y=..density.., alpha=..count..), bins=60) + 
    labs(title="Beneficios", subtitle="") + ylab("Densidad") + 
    xlab("Beneficios ($millones)") + 
    geom_vline(aes(xintercept=mean(b1)), color="red4",linetype="dashed") +
    theme(legend.position = "none") + 
    #annotate() does not see the plot data, so refer to the column as df_i$b1
    annotate("text", x= mean(df_i$b1), y=0, label=round(mean(df_i$b1), digits = 2), 
             colour="red4", size=3.5, vjust=-1.5, hjust=-0.5) 
  ggsave(benp, file=paste0(i,"_histogram.png"))
}

Upvotes: 0
