user432797

Reputation: 593

ggplot functions to replicate plots

I'm trying to replicate the theme of this graph using ggplot. I searched online and found a few articles that discuss changing the colors of two variables in a scatterplot. I tried the following:

d1<-read.csv("./data/games.csv")
p.1<-ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) + 
  geom_point(aes(color = cream_rating))
p.1 + ggtitle("Rating of Cream vs Charcoal") +
  xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5)) + scale_color_manual(
                        values=c("orange", "green"))

I get this error:

ERROR while rich displaying an object: Error: Continuous value supplied to discrete scale

Traceback:
1. FUN(X[[i]], ...)
2. tryCatch(withCallingHandlers({
 .     if (!mime %in% names(repr::mime2repr)) 
 .         stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
 .     rpr <- repr::mime2repr[[mime]](obj)
 .     if (is.null(rpr)) 
 .         return(NULL)
 .     prepare_content(is.raw(rpr), rpr)
 . }, error = error_handler), error = outer_handler)
3. tryCatchList(expr, classes, parentenv, handlers)
4. tryCatchOne(expr, names, parentenv, handlers[[1L]])
5. doTryCatch(return(expr), name, parentenv, handler)
6. withCallingHandlers({
 .     if (!mime %in% names(repr::mime2repr)) 
 .         stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
 .     rpr <- repr::mime2repr[[mime]](obj)
 .     if (is.null(rpr)) 
 .         return(NULL)
 .     prepare_content(is.raw(rpr), rpr)
 . }, error = error_handler)
7. repr::mime2repr[[mime]](obj)
8. repr_text.default(obj)
9. paste(capture.output(print(obj)), collapse = "\n")
10. capture.output(print(obj))
11. evalVis(expr)
12. withVisible(eval(expr, pf))
13. eval(expr, pf)
14. eval(expr, pf)
15. print(obj)
16. print.ggplot(obj)
17. ggplot_build(x)
18. ggplot_build.ggplot(x)
19. lapply(data, scales_train_df, scales = npscales)
20. FUN(X[[i]], ...)
21. lapply(scales$scales, function(scale) scale$train_df(df = df))
22. FUN(X[[i]], ...)
23. scale$train_df(df = df)
24. f(..., self = self)
25. self$train(df[[aesthetic]])
26. f(..., self = self)
27. self$range$train(x, drop = self$drop, na.rm = !self$na.translate)
28. f(..., self = self)
29. scales::train_discrete(x, self$range, drop = drop, na.rm = na.rm)
30. stop("Continuous value supplied to discrete scale", call. = FALSE)

I'm clearly using the wrong function. Which one should I use, and how do I get the cross in the middle of the plot?

structure(list(rated = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, 
TRUE, FALSE, TRUE, TRUE), turns = c(13L, 16L, 61L, 61L, 95L, 
5L, 33L, 9L, 66L, 119L), victory_status = structure(c(3L, 4L, 
2L, 2L, 2L, 1L, 4L, 4L, 4L, 2L), .Label = c("draw", "mate", "outoftime", 
"resign"), class = "factor"), winner = structure(c(2L, 1L, 2L, 
2L, 2L, 3L, 2L, 1L, 1L, 2L), .Label = c("charcoal", "cream", 
"draw"), class = "factor"), increment_code = structure(c(3L, 
7L, 7L, 5L, 6L, 1L, 1L, 4L, 2L, 1L), .Label = c("10+0", "15+0", 
"15+2", "15+30", "20+0", "30+3", "5+10"), class = "factor"), 
    cream_rating = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L, 
    1520L, 1413L, 1439L, 1381L), charcoal_rating = c(1191L, 1261L, 
    1500L, 1454L, 1469L, 1002L, 1423L, 2108L, 1392L, 1209L)), row.names = c(NA, 
10L), class = "data.frame")

This is what I want to achieve: (image not included)

I tried Stefan's suggestion (which was a great help) with some modifications:

d1<-read.csv("./data/games.csv")
ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) + 
  # Map winner on color. Add some transparency in case of overplotting
  geom_point(aes(color = winner), alpha = 0.2) +
  # Add the cross: Add geom_point layers with one variable fixed at its mean
  geom_point(aes(x = mean(cream_rating), color = winner), alpha = 0.2) +
  geom_point(aes(y = mean(charcoal_rating), color = winner), alpha = 0.2) +
  scale_shape_manual(values=c(16, 17)) +
  # "draw"s should be dropped and removed from the title
  scale_color_manual(values = c(cream = "seagreen4", charcoal = "chocolate3", draw = NA)) +
  ggtitle("Rating of Cream vs Charcoal") +
  xlab("rating of cream") + ylab("rating of charcoal") + theme_bw() + theme(plot.title = element_text(hjust = 0.5)) 

I want to filter out "draw" from the plot. Also, when I try to change the point shapes to triangles and circles, they don't change. In addition, I get this error:

Warning message:
“Removed 950 rows containing missing values (geom_point).”
Warning message:
“Removed 950 rows containing missing values (geom_point).”
Warning message:
“Removed 950 rows containing missing values (geom_point).”

One more thing I noticed: I get a double cross instead of one!

This is my output: (image not included)

Upvotes: 0

Views: 285

Answers (1)

stefan

Reputation: 123903

The issue is that you mapped a continuous variable (cream_rating) to color but then added a discrete color scale (scale_color_manual).
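To make the mismatch concrete, here is a minimal sketch of the two pairings that do work. The toy data frame is made up for illustration and only stands in for games.csv; scale_color_gradient is one possible continuous scale, not necessarily the one you want.

```r
library(ggplot2)

# Toy data standing in for games.csv (values are illustrative only)
toy <- data.frame(
  cream_rating    = c(1500, 1322, 1496, 1439),
  charcoal_rating = c(1191, 1261, 1500, 1454),
  winner          = c("cream", "charcoal", "cream", "charcoal")
)

base <- ggplot(toy, aes(x = cream_rating, y = charcoal_rating))

# Continuous variable on color: pair it with a continuous scale
p_continuous <- base +
  geom_point(aes(color = cream_rating)) +
  scale_color_gradient(low = "orange", high = "green")

# Discrete variable on color: pair it with a manual (discrete) scale
p_discrete <- base +
  geom_point(aes(color = winner)) +
  scale_color_manual(values = c(cream = "green", charcoal = "orange"))
```

Printing p_continuous or p_discrete draws the plot. Swapping the two scales reproduces the "Continuous value supplied to discrete scale" error.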

  1. As the plots in your images show, there are only two colors, so we need a discrete variable. Since your data is about ratings, my guess is that you have to map winner to color to reproduce the plots. One question remains: what about draws? In the code below I set the color for draws to NA, i.e. draws are dropped, but you can change that as you like.

  2. From the image I also guess that some transparency was used to tackle overplotting. This can be achieved via the alpha argument, which I set to 0.6.

  3. Concerning the cross appearing in your plot: it is hard to tell, but my guess is that the data was "replicated" twice, each time fixing one of the rating variables at its mean value. If this guess is correct, we can get the cross via two additional geom_point layers.

library(ggplot2)

d1 <- structure(list(
  rated = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE),
  turns = c(13L, 16L, 61L, 61L, 95L, 5L, 33L, 9L, 66L, 119L),
  victory_status = structure(c(3L, 4L, 2L, 2L, 2L, 1L, 4L, 4L, 4L, 2L),
    .Label = c("draw", "mate", "outoftime", "resign"), class = "factor"),
  winner = structure(c(2L, 1L, 2L, 2L, 2L, 3L, 2L, 1L, 1L, 2L),
    .Label = c("charcoal", "cream", "draw"), class = "factor"),
  increment_code = structure(c(3L, 7L, 7L, 5L, 6L, 1L, 1L, 4L, 2L, 1L),
    .Label = c("10+0", "15+0", "15+2", "15+30", "20+0", "30+3", "5+10"),
    class = "factor"),
  cream_rating = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L, 1520L, 1413L, 1439L, 1381L),
  charcoal_rating = c(1191L, 1261L, 1500L, 1454L, 1469L, 1002L, 1423L, 2108L, 1392L, 1209L)
), row.names = c(NA, 10L), class = "data.frame")

ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) + 
  # Map winner on color. Add some transparency in case of overplotting
  geom_point(aes(color = winner), alpha = 0.6) +
  # Just a guess to add the cross: Add geom_point layers with one variable fixed at its mean
  geom_point(aes(x = mean(cream_rating), color = winner), alpha = 0.6) +
  geom_point(aes(y = mean(charcoal_rating), color = winner), alpha = 0.6) +
  # Should "draw"s be colored or dropped?
  scale_color_manual(values = c(cream = "green", charcoal = "orange", draw = NA)) +
  ggtitle("Rating of Cream vs Charcoal") +
  xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5))

EDIT

  1. The shapes don't show up because you did not map winner to the shape aesthetic. scale_shape_manual only takes effect when something is mapped to shape.

  2. The "errors" are simply warnings, which arise because we set the color for draws to NA. These are the rows that ggplot removes. To get rid of the draws, simply filter your dataset before plotting:

library(ggplot2)
library(dplyr)

d1 %>% 
  filter(winner != "draw") %>% 
  ggplot(aes(x=cream_rating, y=charcoal_rating, color = winner, shape = winner)) + 
  # Map winner on color. Add some transparency in case of overplotting
  geom_point(alpha = 0.6, na.rm = TRUE) +
  # Just a guess to add the cross: Add geom_point layers with one variable fixed at its mean
  geom_point(aes(x = mean(cream_rating)), alpha = 0.6) +
  geom_point(aes(y = mean(charcoal_rating)), alpha = 0.6) +
  # Draws were filtered out above, so only two colors and shapes are needed
  scale_color_manual(values = c(cream = "green", charcoal = "orange")) +
  scale_shape_manual(values = c(cream = 16, charcoal = 17)) +
  ggtitle("Rating of Cream vs Charcoal") +
  xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5))
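If the cross in the target image turns out to be two straight lines rather than replicated points, a different technique would be reference lines at the means via geom_vline and geom_hline. This is only a sketch under that assumption, not what the original plot necessarily does. The toy data frame and the dashed line type are illustrative choices.

```r
library(ggplot2)

# Toy data with draws already removed (values are illustrative only)
toy <- data.frame(
  cream_rating    = c(1500, 1322, 1496, 1439),
  charcoal_rating = c(1191, 1261, 1500, 1454),
  winner          = c("cream", "charcoal", "cream", "charcoal")
)

p_cross <- ggplot(toy, aes(x = cream_rating, y = charcoal_rating)) +
  geom_point(aes(color = winner, shape = winner), alpha = 0.6) +
  # Vertical line at the mean cream rating, horizontal line at the mean charcoal rating
  geom_vline(xintercept = mean(toy$cream_rating), linetype = "dashed") +
  geom_hline(yintercept = mean(toy$charcoal_rating), linetype = "dashed") +
  scale_color_manual(values = c(cream = "green", charcoal = "orange")) +
  scale_shape_manual(values = c(cream = 16, charcoal = 17)) +
  ggtitle("Rating of Cream vs Charcoal") +
  xlab("rating of cream") + ylab("rating of charcoal") +
  theme(plot.title = element_text(hjust = 0.5))
```

Because the means are computed once outside aes(), each line is drawn exactly once regardless of how the points are grouped.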

Upvotes: 1
