stats_noob

Reputation: 5907

R: Adding Two Series to a Graph

Using the following website (http://r-statistics.co/Top50-Ggplot2-Visualizations-MasterList-R-Code.html), I made the graph below:

mtcars$`car name` <- rownames(mtcars)  # create new column for car names
mtcars$mpg_z <- round((mtcars$mpg - mean(mtcars$mpg))/sd(mtcars$mpg), 2)  # compute normalized mpg
mtcars$mpg_type <- ifelse(mtcars$mpg_z < 0, "below", "above")  # above / below avg flag
mtcars <- mtcars[order(mtcars$mpg_z), ]  # sort
mtcars$`car name` <- factor(mtcars$`car name`, levels = mtcars$`car name`)  # convert to factor to retain sorted order in plot.

library(ggplot2)
theme_set(theme_bw())

# Plot
ggplot(mtcars, aes(x=`car name`, y=mpg_z, label=mpg_z)) + 
  geom_point(stat='identity', aes(col=mpg_type), size=6)  +
  scale_color_manual(name="Mileage", 
                     labels = c("Above Average", "Below Average"), 
                     values = c("above"="#00ba38", "below"="#f8766d")) + 
  geom_text(color="white", size=2) +
  labs(title="Diverging Dot Plot", 
       subtitle="Normalized mileage from 'mtcars': Dotplot") + 
  ylim(-2.5, 2.5) +
  coord_flip()

[Image: the resulting diverging dot plot]

My Question: I want to modify the above graph so that there are "2 dots" (green and red) on each horizontal line, representing the values of two different variables.

I created a data set for this example:

my_data = data.frame(var_1_col = "red", var_2_col = "green", var_1 = rnorm(8,10,10), var_2 = rnorm(8,5,1), name = c("A", "B", "C", "D", "E", "F", "G", "H"))

  var_1_col var_2_col     var_1    var_2 name
1       red     green 14.726642 4.676161    A
2       red     green 11.011187 4.937376    B
3       red     green 12.418489 5.869617    C
4       red     green 21.935154 5.641106    D
5       red     green 20.209498 6.193123    E
6       red     green -5.339944 5.187093    F
7       red     green 20.540806 3.895683    G
8       red     green 21.619631 4.097438    H

Then, I tried to create the graph - but it comes out as empty:

# Plot
ggplot(my_data, aes(x=name, y=var_1, label=name)) + 
  geom_point(stat='identity', aes(col=var_1_col), size=6)  +
  scale_color_manual(name="Var 1 or Var 2", 
                     labels = c("Var 1", "Var 2"), 
                     values = c("Var 1"="#00ba38", "Var 2"="#f8766d")) + 
  geom_text(color="white", size=2) +
  labs(title="Plot", 
       subtitle="Plot: Dotplot") + 
  ylim(-2.5, 2.5) +
  coord_flip()

[Image: the empty plot]

Ideally, I would like the graph to look something like this:

[Image: sketch of the desired plot]

Can someone please show me how to do this?

Thanks!

Note: var_1 could be some variable like "average fuel price" and var_2 could be "median fuel price"

Upvotes: 1

Views: 258

Answers (2)

AndrewGB

Reputation: 16856

I recommend reshaping the data into long format, which is what ggplot2 works best with. The two color columns can be dropped, since the colors are set in scale_color_manual instead. Mapping color to the new variable column inside aes for geom_point then draws var_1 and var_2 as separately colored groups, and scale_color_manual still controls the legend title, labels, and colors.

library(tidyverse)

my_data %>%
  select(-c(var_1_col, var_2_col)) %>%  # colors are set in scale_color_manual instead
  pivot_longer(-name, names_to = "variable", values_to = "value") %>%  # one row per name/variable pair
  ggplot(aes(x = name, y = value)) +
  geom_point(aes(color = variable), size = 6) +  # one color per source variable
  scale_color_manual(
    name = "Var 1 or Var 2",
    labels = c("Var 1", "Var 2"),
    values = c(var_1 = "#00ba38", var_2 = "#f8766d")  # named, so each color is tied to its variable
  ) +
  labs(title = "Plot",
       subtitle = "Plot: Dotplot") +
  coord_flip() +
  theme_bw()

Output

[Image: the resulting plot]
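To make the reshaping step easier to follow, here is a minimal sketch of what the long format looks like. It uses the values printed in the question rather than fresh rnorm draws so that the result is reproducible, and long_data is just an illustrative name. It also counts how many values fall inside the ylim(-2.5, 2.5) range from the original attempt. ggplot2 drops points that lie outside the limits set with ylim, which is most likely why that plot came out empty and why the limits are left out above.

```r
library(dplyr)
library(tidyr)

# Values copied from the data printed in the question (assumed here for reproducibility)
my_data <- data.frame(
  var_1_col = "red",
  var_2_col = "green",
  var_1 = c(14.726642, 11.011187, 12.418489, 21.935154,
            20.209498, -5.339944, 20.540806, 21.619631),
  var_2 = c(4.676161, 4.937376, 5.869617, 5.641106,
            6.193123, 5.187093, 3.895683, 4.097438),
  name = LETTERS[1:8]
)

# Same reshaping as in the plotting pipeline, stopped before ggplot()
long_data <- my_data %>%
  select(-c(var_1_col, var_2_col)) %>%
  pivot_longer(-name, names_to = "variable", values_to = "value")

# One row per (name, variable) pair: 8 names x 2 variables = 16 rows, 3 columns
dim(long_data)
head(long_data, 4)

# Number of values inside the question's ylim(-2.5, 2.5): none of them are
in_range <- sum(long_data$value >= -2.5 & long_data$value <= 2.5)
in_range
```

If you do want fixed axis limits, pick a range that covers the data, for example based on range(long_data$value).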

Upvotes: 2

vintro

Reputation: 21

I want to modify [...], representing the values of two different variables.

If you want to plot two different variables on the same graph (and they share a common axis, like name in this case), you can add two separate geom_point layers, each with its own aes mapping.

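library(ggplot2)

ggplot(my_data) +
  geom_point(aes(x = name, y = var_1, col = var_1_col)) +
  geom_point(aes(x = name, y = var_2, col = var_2_col)) +
  scale_color_identity(guide = "legend") +  # use the "red"/"green" strings as literal colors
  coord_flip()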

You don't always have to define the axes, colors, or labels in the initial ggplot call. If you specify only the dataset there, each geom can map its own variables, which is how you can put multiple layers on one plot :)

see the resultant plot here
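As a variation on the layered approach, here is a rough sketch that does not rely on the color columns at all. Each layer maps color to a constant label inside aes, and scale_color_manual assigns the actual colors. The data values are copied from the question's printout, and the choice of red for var_1 and green for var_2 is an assumption based on the var_1_col and var_2_col columns. The object name p is only for illustration.

```r
library(ggplot2)

# Values copied from the data printed in the question (assumed here for reproducibility)
my_data <- data.frame(
  var_1 = c(14.726642, 11.011187, 12.418489, 21.935154,
            20.209498, -5.339944, 20.540806, 21.619631),
  var_2 = c(4.676161, 4.937376, 5.869617, 5.641106,
            6.193123, 5.187093, 3.895683, 4.097438),
  name = LETTERS[1:8]
)

p <- ggplot(my_data, aes(x = name)) +
  # A constant string inside aes() becomes a legend entry for that layer
  geom_point(aes(y = var_1, color = "Var 1"), size = 6) +
  geom_point(aes(y = var_2, color = "Var 2"), size = 6) +
  scale_color_manual(
    name = "Var 1 or Var 2",
    values = c("Var 1" = "#f8766d", "Var 2" = "#00ba38")  # assumed: red for var_1, green for var_2
  ) +
  labs(title = "Plot", subtitle = "Plot: Dotplot", y = "value") +
  coord_flip() +
  theme_bw()

p
```

This keeps the data in its original wide shape, at the cost of one geom_point call per variable, so it is most convenient when there are only a few variables to show.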

Upvotes: 2
