Jesse

Reputation: 244

Plotting multiple grouped variable datasets in ggplot

I'm trying to plot multiple datasets with grouped variables in ggplot, and I'm running into a problem. I have two datasets:

df.1 <- data.frame(
name = c( "a", "b", "c", "d" ),
x = c( 3, 2, 1, 2 ),
y = c( 4, 3, 4, 3 ),
z = c( 8, 9, 6, 7 ) )

df.2 <- data.frame(
name = c( "o", "p", "q", "r" ),
x = c( 8, 7, 6, 9 ),
y = c( 4, 1, 4, 3 ),
z = c( 1, 2, 2, 2 ) )

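Then I melt each of them so the values are grouped by name:

library( reshape2 )  # assumed source of melt(); it produces the variable/value columns used below
df.1.melted  <- melt( df.1, id.vars = "name" )
df.2.melted  <- melt( df.2, id.vars = "name" )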

Now, I want a plot where the x-axis has x, y, and z grouped and the y-axis is the value, with each sample linked by the name already given to it. I can do this for one of the datasets (I want a log scale eventually so it's included):

library( ggplot2 )
library( scales )  # provides log_trans()
ggplot( df.1.melted, aes( x = variable, 
                          y = value, 
                          group = df.1.melted$name, 
                          col = df.1.melted$name ) ) +
scale_y_continuous( trans = log_trans(), limits = c( 1, 10 ), 
                    breaks = c( 1, 10 ) ) +
labs( x = "", y = "value" ) +
geom_point( size = 4 ) +
geom_line( size = 1 ) 

Which gives me something reasonable.

Then I can add the second data set by:

ggplot( df.1.melted, aes( x = variable, 
                          y = value, 
                          group = df.1.melted$name, 
                          col = df.1.melted$name ) ) +
scale_y_continuous( trans = log_trans(), limits = c( 1, 10 ), 
                    breaks = c( 1, 10 ) ) +
labs( x = "", y = "value" ) +
geom_point( size = 4 ) +
geom_line( size = 1 ) +

geom_point( data = df.2.melted, aes( x = df.2.melted$variable,
                                     y = df.2.melted$value, 
                                     group = df.2.melted$name, 
                                     col = df.2.melted$name ), 
            size = 4 ) +
geom_line( data = df.2.melted, aes( x = df.2.melted$variable,
                                    y = df.2.melted$value, 
                                    group = df.2.melted$name, 
                                    col = df.2.melted$name ), 
           size = 1 ) 

which yields a plot with both datasets overlaid.

This is the gist of what I'm after, but I'm running into a problem: how can I override the default color scheme when using aes( group = ... )? I want either to store predefined colors in the data frame or to define them in geom_point(). The colors should be specific to the data frame being plotted, e.g. df.1.melted in darkgreen and df.2.melted in orange. I haven't found a way to plot these without group = in the aes() call, so I don't have a workaround at the moment.

A solution looks possible, judging by the ggplot example in the answer to "R plotly - Plotting grouped lines".

However, I'm not familiar enough with dplyr to work out how that plot is created.

Thanks for any advice

Upvotes: 2

Views: 2292

Answers (1)

markus

Reputation: 26343

You could try this:

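library(ggplot2)
library(dplyr)
# Stack both melted data frames into one
df_melted <- bind_rows(df.1.melted, df.2.melted)
df_melted %>% 
 # Label each row with its source data frame (also works if the sizes differ)
 mutate(df = rep(c('df.1', 'df.2'),
                 times = c(nrow(df.1.melted), nrow(df.2.melted)))) %>% 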
 ggplot(aes(x = variable,
            y = value,
            col = df)) +
 geom_line(aes(group = name)) +
 geom_point() +
 scale_y_log10(limits = c( 1, 10), 
               breaks = c(1, 10)) +
 scale_color_manual(values = c('df.1' = "forestgreen",
                               'df.2' = "orange"))


The idea is to create one data frame, df_melted, and add a column df that records which data frame each observation came from. You can then map df to the colour aesthetic, while group = name inside geom_line() still draws one line per sample. As suggested in the comment, you can change the default colours using scale_colour_manual().
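
As a possible variation on the same idea, bind_rows() can create the source column itself through its .id argument, so the labels do not have to be built with rep(). The following is only a sketch under the assumption that melt() comes from reshape2; it rebuilds the question's data so that it runs on its own, and the object name p is introduced here purely for illustration.

```r
library(reshape2)  # assumed source of melt()
library(dplyr)     # bind_rows()
library(ggplot2)

# Data copied from the question
df.1 <- data.frame(name = c("a", "b", "c", "d"),
                   x = c(3, 2, 1, 2), y = c(4, 3, 4, 3), z = c(8, 9, 6, 7))
df.2 <- data.frame(name = c("o", "p", "q", "r"),
                   x = c(8, 7, 6, 9), y = c(4, 1, 4, 3), z = c(1, 2, 2, 2))

df.1.melted <- melt(df.1, id.vars = "name")
df.2.melted <- melt(df.2, id.vars = "name")

# Passing a named list together with .id adds a column "df"
# that holds the list name of the data frame each row came from
df_melted <- bind_rows(list(df.1 = df.1.melted, df.2 = df.2.melted),
                       .id = "df")

# Colour by source data frame, but draw one line per sample name
p <- ggplot(df_melted, aes(x = variable, y = value, col = df)) +
  geom_line(aes(group = name)) +
  geom_point() +
  scale_y_log10(limits = c(1, 10), breaks = c(1, 10)) +
  scale_color_manual(values = c(df.1 = "forestgreen", df.2 = "orange"))
print(p)
```

Because the labels come from the list names, this version should not depend on the two data frames having the same number of rows.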

Upvotes: 2
