Reputation: 464
I have a data set that has two values per row I'd like to plot against each other.
For example:
RHC,1,0.370,0.287,0.003,0.063
SA,1,0.352,0.258,0.003,0.057
GA,1,0.121,0.091,0.430,0.008
I want to plot one line per value column, grouped by the first column. For the RHC row, for example, I'm plotting the points {x, y1} = {1, 0.370} and {x, y2} = {1, 0.287}.
The following ggplot/geom_smooth accomplishes this:
ggplot(data=d) +
  geom_smooth(aes(x=iterations, y=training.error, col=algorithm)) +
  geom_smooth(aes(x=iterations, y=testing.error, col=algorithm))
However, the training and testing lines for each algorithm end up with a single color and a single legend entry, making them impossible to tell apart.
How can I apply a different color and a corresponding legend entry to each line produced by each geom_smooth call?
To reproduce:
library(ggplot2)
d <- read.csv("https://gist.githubusercontent.com/jameskyle/8d233dcbd0ad0b66bfdd/raw/9c975ac9d9bbcb633e44cfd70b66f7ab89dc1517/results.csv")
p1 <- ggplot(data=d) +
  geom_smooth(aes(x=iterations, y=training.error, col=algorithm)) +
  geom_smooth(aes(x=iterations, y=testing.error, col=algorithm))
pdf("graph.pdf")
print(p1)
dev.off()
The above code will produce:
Upvotes: 1
Views: 1541
Reputation: 83215
Because you have several lines quite close to each other in one plot, it is probably better to use facets to get a clearer plot. To do that, the data first has to be reshaped into long format.
With the data.table package you can reshape several sets of columns into long format simultaneously:
library(data.table)
# melting operation for the error & time columns simultaneously
# and setting the appropriate labels for the variable column
d1 <- melt(setDT(d),
           measure.vars = patterns('.error','.time'),
           value.name = c('error','time'))[, variable := c('train','test')[variable]]
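To see what this reshape produces, here is a minimal sketch on a toy table built from the three sample rows in the question. The time column names (training.time, testing.time) are an assumption inferred from the '.time' pattern, and the anchored regular expressions are just a stricter variant of the patterns used above.

```r
library(data.table)

# Toy wide table using the three sample rows from the question.
# ASSUMPTION: the last two columns are named training.time and testing.time.
toy <- data.table(
  algorithm      = c("RHC", "SA", "GA"),
  iterations     = c(1, 1, 1),
  training.error = c(0.370, 0.352, 0.121),
  testing.error  = c(0.287, 0.258, 0.091),
  training.time  = c(0.003, 0.003, 0.430),
  testing.time   = c(0.063, 0.057, 0.008)
)

# Melt the two error columns and the two time columns in one call.
# Each pattern collects one group of columns; value.name names the outputs.
toy_long <- melt(toy,
                 measure.vars = patterns("\\.error$", "\\.time$"),
                 value.name   = c("error", "time"))

# 'variable' is a factor with levels 1 and 2 (position within each group):
# 1 = training.* (first match), 2 = testing.* (second match).
toy_long[, variable := c("train", "test")[variable]]

print(toy_long)
```

Each original row now appears twice, once with variable set to train and once with test, which is the shape ggplot2 needs to map the error type to an aesthetic.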
Now you can make the faceted plot (I've added a fill as well to differentiate between the shaded areas):
ggplot(data=d1) +
  geom_smooth(aes(x=iterations, y=error, col=variable, fill=variable), size=1) +
  facet_grid(. ~ algorithm) +
  theme_bw()
This results in:
If you really want everything in one plot, you can add a linetype to the aes as well in order to better differentiate between the several lines:
ggplot(data=d1) +
  geom_smooth(aes(x=iterations, y=error, col=algorithm, linetype=variable), size=1) +
  theme_bw()
the result:
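If every line should instead get its own color and its own legend entry, as the question literally asks, one possible variant is to map colour to the combination of algorithm and variable using base R's interaction(). This is only a sketch: d_long below is synthetic stand-in data with hypothetical values, not the real results, and the legend title is an arbitrary choice.

```r
library(ggplot2)

# Synthetic long-format data standing in for d1 (values are made up).
set.seed(1)
d_long <- expand.grid(
  iterations = 1:50,
  algorithm  = c("RHC", "SA", "GA"),
  variable   = c("train", "test")
)
d_long$error <- 1 / sqrt(d_long$iterations) + rnorm(nrow(d_long), sd = 0.02)

# One colour (and one legend entry) per algorithm / error-type combination.
p <- ggplot(d_long,
            aes(x = iterations, y = error,
                colour = interaction(algorithm, variable, sep = " / "))) +
  geom_smooth(se = FALSE) +
  labs(colour = "algorithm / error type") +
  theme_bw()

print(p)
```

With three algorithms and two error types this yields six colors, which can be harder to read than the color plus linetype combination above, so treat it as an option rather than a recommendation.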
Upvotes: 4