Reputation: 73
I want to plot lines for separate data frames in the same graphic with a different color for each data frame. I can get a legend using almost the same code and aes(colour = "hard-coded-name") but I don't know the names ahead of time. I don't have enough RAM to rbind the data frames into a single data frame. I've written a sample that produces the plot with the colored lines. How do I add a legend? As in the sample, you don't know ahead of time how many data frames are in the list (ldf) or what their names are.
library('ggplot2')
f30 <- function() {
  ###############################################################
  ##### Create a list with a random number of data frames #######
  ##### The names of the list elements are "random"       #######
  ###############################################################
  f1 <- function(i) {
    b <- sample(1:10, sample(8:10, 1))
    a <- sample(1:100, length(b))
    data.frame(Before = b, After = a)
  }
  ldf <- sapply(1:sample(2:8,1), f1, simplify = FALSE)
  names(ldf) <- LETTERS[sample(1:length(LETTERS), length(ldf))]
  palette <- c(
    "#000000", "#E69F00", "#56B4E9", "#009E73",
    "#F0E442", "#0072B2", "#D55E00", "#CC79A7"
  )
  ###############################################################
  ##### Above this point we're just creating a sample ldf #######
  ###############################################################
  ePlot <- new.env(parent = emptyenv())
  fColorsButNoLegend <- function(ix) {
    df <- ldf[[ix]]
    n <- names(ldf)[ix]
    if (ix == 1) {
      ePlot$p <- ggplot(df, aes(x = Before, y = After)) +
        geom_line(colour = palette[ix])
    } else {
      ePlot$p <- ePlot$p +
        geom_line(
          colour = palette[ix],
          aes(x = Before, y = After),
          df
        )
    }
  }
  sapply(1:length(ldf), fColorsButNoLegend)
  #Add the title and display the plot
  a <- paste(names(ldf), collapse = ', ')
  ePlot$p <- ePlot$p +
    ggtitle(paste("Before and After:", a))
  ePlot$p
}
Upvotes: 0
Views: 74
Reputation: 73
Serendipitously, I saw how another graph package provides an alternative to a legend that saves screen real estate and, I would think, is more efficient than adding a column or duplicating data. I thought I would provide it here in case others might find it useful. It embeds the legend info in the empty space of the graph itself. See the fAnnotate function - which is primitive but enough to provide the germ of an idea.
f30 <- function() {
  ###############################################################
  ##### Create a list with a random number of data frames #######
  ##### The names of the list elements are "random"       #######
  ###############################################################
  f1 <- function(i) {
    b <- sample(1:10, sample(8:10, 1))
    a <- sample(1:100, length(b))
    data.frame(Before = b, After = a)
  }
  ldf <- sapply(1:sample(2:8,1), f1, simplify = FALSE)
  names(ldf) <- LETTERS[sample(1:length(LETTERS), length(ldf))]
  palette <- c(
    "#000000", "#E69F00", "#56B4E9", "#009E73",
    "#F0E442", "#0072B2", "#D55E00", "#CC79A7"
  )
  ###############################################################
  ##### Above this point we're just creating a sample ldf #######
  ###############################################################
  ePlot <- new.env(parent = emptyenv())
  ePlot$xMin <- Inf
  ePlot$xMax <- -Inf
  ePlot$yMin <- Inf
  ePlot$yMax <- -Inf
  fColorsButNoLegend <- function(ix) {
    df <- ldf[[ix]]
    #Compute the boundaries of x and y
    ePlot$xMin <- min(ePlot$xMin, min(df$Before))
    ePlot$xMax <- max(ePlot$xMax, max(df$Before))
    ePlot$yMin <- min(ePlot$yMin, min(df$After))
    ePlot$yMax <- max(ePlot$yMax, max(df$After))
    n <- names(ldf)[ix]
    if (ix == 1) {
      ePlot$p <- ggplot(df, aes(x = Before, y = After)) +
        geom_line(colour = palette[ix])
    } else {
      ePlot$p <- ePlot$p +
        geom_line(
          colour = palette[ix],
          aes(x = Before, y = After),
          df
        )
    }
  }
  sapply(1:length(ldf), fColorsButNoLegend)
  #Divide by length+1 to leave room on either side of the labels
  xGap <- (ePlot$xMax - ePlot$xMin) / (length(ldf) + 1)
  fAnnotate <- function(ix) {
    x <- ePlot$xMin + (ix * xGap)
    lbl <- paste('---', names(ldf)[ix])
    b <- palette[ix]
    ePlot$p <- ePlot$p +
      annotate("text", x = x, y = -Inf, vjust = -1, label = lbl, colour = b)
  }
  sapply(1:length(ldf), fAnnotate)
  #Add the title and display the plot
  allNames <- paste(names(ldf), collapse = ', ')
  ePlot$p <- ePlot$p +
    ggtitle(paste("Before and After:", allNames))
  ePlot$p
}
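The per-label sapply loop in fAnnotate could probably be collapsed into a single annotate() call, since annotate() accepts vectors for x and label. The sketch below is a minimal, self-contained variant under a few assumptions: the two data frames in ldf are made-up stand-ins for the random ones, and cols, xRange, xPos, lbls and p are names introduced here for illustration only.

```r
library(ggplot2)

# Made-up stand-in for the randomly generated ldf (assumption)
ldf <- list(
  Q = data.frame(Before = 1:5, After = c(3, 8, 2, 9, 4)),
  C = data.frame(Before = 1:5, After = c(7, 1, 6, 2, 5))
)
palette <- c("#000000", "#E69F00", "#56B4E9", "#009E73",
             "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
cols <- palette[seq_along(ldf)]

# One layer per data frame, each with a fixed colour (so no legend)
p <- ggplot(mapping = aes(x = Before, y = After))
for (i in seq_along(ldf)) {
  p <- p + geom_line(data = ldf[[i]], colour = cols[i])
}

# Spread the labels evenly across the overall x range,
# dividing by length + 1 to leave room on either side
xRange <- range(unlist(lapply(ldf, function(d) range(d$Before))))
xGap <- diff(xRange) / (length(ldf) + 1)
xPos <- xRange[1] + seq_along(ldf) * xGap
lbls <- paste("---", names(ldf))

# A single annotate() call places every label along the bottom edge
p <- p + annotate("text", x = xPos, y = -Inf, vjust = -1,
                  label = lbls, colour = cols)
p
```

Computing the x range directly from the list also removes the need to track xMin and xMax in the environment while the layers are being added.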
Upvotes: 0
Reputation: 93761
Let's put aside, for the moment, the issue of whether you would ever need to make a line plot with more data than could be held in RAM. Since the list elements are named, you can use those names to generate a color legend, even if you don't know beforehand what those names will be.
For example, in the code below, I add the name of the list element as a new source column in the data frame, and then use that source column as the colour aesthetic. Then, just before printing the plot, I add a scale_colour_manual statement in order to set the line colors to your color palette:
ePlot <- new.env(parent = emptyenv())
fColorsButNoLegend <- function(ix) {
  df <- ldf[[ix]]
  # Add name of list element as a new column
  df$source <- names(ldf)[ix]
  if (ix == 1) {
    ePlot$p <- ggplot(df, aes(x = Before, y = After, colour = source)) +
      geom_line()
  } else {
    ePlot$p <- ePlot$p +
      geom_line(
        aes(x = Before, y = After, colour = source),
        df
      )
  }
}
sapply(1:length(ldf), fColorsButNoLegend)
#Add the title and display the plot
a <- paste(names(ldf), collapse = ', ')
# Name the palette entries so each list element keeps the colour at its own
# list position; an unnamed vector is assigned in alphabetical order of source
ePlot$p <- ePlot$p +
  ggtitle(paste("Before and After:", a)) +
  scale_colour_manual(values = setNames(palette[seq_along(ldf)], names(ldf)))
ePlot$p
Here's sample output from the function:
f30()
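If adding a source column to every data frame is a concern, one variation is to map colour to the list name as a constant inside aes(). This is the hard-coded-name trick from the question, with the name injected programmatically. Treat it as a sketch rather than a drop-in replacement: it assumes ggplot2 3.0.0 or later (for !! inside aes()), and the two data frames in ldf are made up for illustration.

```r
library(ggplot2)

# Made-up stand-in for the randomly generated ldf (assumption)
ldf <- list(
  Q = data.frame(Before = 1:5, After = c(3, 8, 2, 9, 4)),
  C = data.frame(Before = 1:5, After = c(7, 1, 6, 2, 5))
)
palette <- c("#000000", "#E69F00", "#56B4E9", "#009E73",
             "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# Start from an empty plot; every layer brings its own data frame
p <- ggplot(mapping = aes(x = Before, y = After))
for (nm in names(ldf)) {
  # !! injects the current value of nm as a constant string,
  # so each layer keeps its own legend label and no column is added
  p <- p + geom_line(aes(colour = !!nm), data = ldf[[nm]])
}

# Named values tie each list element to the palette entry at its position
p <- p + scale_colour_manual(
  name = "source",
  values = setNames(palette[seq_along(ldf)], names(ldf))
)
p
```

The !! matters because aes() otherwise captures nm unevaluated, and every layer would pick up whatever value nm holds when the plot is finally built.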
Upvotes: 1