Reputation: 593
I'm trying to replicate the theme of these graphs using ggplot. I searched online for how to assign colors and found a few articles that discuss changing the colors of two variables in a scatterplot. I tried the following:
d1<-read.csv("./data/games.csv")
p.1<-ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) +
geom_point(aes(color = cream_rating))
p.1 + ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5)) + scale_color_manual(
values=c("orange", "green"))
I get this error:
ERROR while rich displaying an object: Error: Continuous value supplied to discrete scale
Traceback:
1. FUN(X[[i]], ...)
2. tryCatch(withCallingHandlers({
. if (!mime %in% names(repr::mime2repr))
. stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
. rpr <- repr::mime2repr[[mime]](obj)
. if (is.null(rpr))
. return(NULL)
. prepare_content(is.raw(rpr), rpr)
. }, error = error_handler), error = outer_handler)
3. tryCatchList(expr, classes, parentenv, handlers)
4. tryCatchOne(expr, names, parentenv, handlers[[1L]])
5. doTryCatch(return(expr), name, parentenv, handler)
6. withCallingHandlers({
. if (!mime %in% names(repr::mime2repr))
. stop("No repr_* for mimetype ", mime, " in repr::mime2repr")
. rpr <- repr::mime2repr[[mime]](obj)
. if (is.null(rpr))
. return(NULL)
. prepare_content(is.raw(rpr), rpr)
. }, error = error_handler)
7. repr::mime2repr[[mime]](obj)
8. repr_text.default(obj)
9. paste(capture.output(print(obj)), collapse = "\n")
10. capture.output(print(obj))
11. evalVis(expr)
12. withVisible(eval(expr, pf))
13. eval(expr, pf)
14. eval(expr, pf)
15. print(obj)
16. print.ggplot(obj)
17. ggplot_build(x)
18. ggplot_build.ggplot(x)
19. lapply(data, scales_train_df, scales = npscales)
20. FUN(X[[i]], ...)
21. lapply(scales$scales, function(scale) scale$train_df(df = df))
22. FUN(X[[i]], ...)
23. scale$train_df(df = df)
24. f(..., self = self)
25. self$train(df[[aesthetic]])
26. f(..., self = self)
27. self$range$train(x, drop = self$drop, na.rm = !self$na.translate)
28. f(..., self = self)
29. scales::train_discrete(x, self$range, drop = drop, na.rm = na.rm)
30. stop("Continuous value supplied to discrete scale", call. = FALSE)
I'm apparently using the wrong function. Which one should I use, and how do I get the cross in the middle?
structure(list(rated = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
TRUE, FALSE, TRUE, TRUE), turns = c(13L, 16L, 61L, 61L, 95L,
5L, 33L, 9L, 66L, 119L), victory_status = structure(c(3L, 4L,
2L, 2L, 2L, 1L, 4L, 4L, 4L, 2L), .Label = c("draw", "mate", "outoftime",
"resign"), class = "factor"), winner = structure(c(2L, 1L, 2L,
2L, 2L, 3L, 2L, 1L, 1L, 2L), .Label = c("charcoal", "cream",
"draw"), class = "factor"), increment_code = structure(c(3L,
7L, 7L, 5L, 6L, 1L, 1L, 4L, 2L, 1L), .Label = c("10+0", "15+0",
"15+2", "15+30", "20+0", "30+3", "5+10"), class = "factor"),
cream_rating = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L,
1520L, 1413L, 1439L, 1381L), charcoal_rating = c(1191L, 1261L,
1500L, 1454L, 1469L, 1002L, 1423L, 2108L, 1392L, 1209L)), row.names = c(NA,
10L), class = "data.frame")
This is what I want to achieve:
I tried Stefan's suggestion (which was great help) with some modifications:
d1<-read.csv("./data/games.csv")
ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) +
# Map winner on color. Add some transparency in case of overplotting
geom_point(aes(color = winner), alpha = 0.2) +
# Add the cross: add geom_point layers with one variable fixed at its mean
geom_point(aes(x = mean(cream_rating), color = winner), alpha = 0.2) +
geom_point(aes(y = mean(charcoal_rating), color = winner), alpha = 0.2) +
scale_shape_manual(values=c(16, 17)) +
# "draw"s should be dropped and removed from the title
scale_color_manual(values = c(cream = "seagreen4", charcoal = "chocolate3", draw = NA)) +
ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal") + theme_bw() + theme(plot.title = element_text(hjust = 0.5))
I want to filter out "draw" from the plot. Also, when I change the dot shapes to triangles and circles, they don't seem to change. In addition, I get this error:
Warning message:
“Removed 950 rows containing missing values (geom_point).”
Warning message:
“Removed 950 rows containing missing values (geom_point).”
Warning message:
“Removed 950 rows containing missing values (geom_point).”
One more thing I noticed: I get a double cross instead of one!
Upvotes: 0
Views: 285
Reputation: 123903
The issue is that you mapped a continuous variable (cream_rating) on a discrete color scale (scale_color_manual).
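To make the mismatch concrete, here is a minimal sketch using the first five rows of your sample data. The names toy, p_continuous and p_discrete are just for illustration, and scale_color_gradient is only one of several continuous color scales you could pick. A continuous variable needs a continuous scale, while scale_color_manual only works once a discrete variable such as winner is mapped on color.

```r
library(ggplot2)

# First five rows of the question's sample data (only the columns used here)
toy <- data.frame(
  cream_rating    = c(1500, 1322, 1496, 1439, 1523),
  charcoal_rating = c(1191, 1261, 1500, 1454, 1469),
  winner          = factor(c("cream", "charcoal", "cream", "cream", "cream"))
)

# Continuous variable on color: pair it with a continuous scale
p_continuous <- ggplot(toy, aes(x = cream_rating, y = charcoal_rating)) +
  geom_point(aes(color = cream_rating)) +
  scale_color_gradient(low = "orange", high = "green")

# Discrete variable on color: scale_color_manual() is the right tool
p_discrete <- ggplot(toy, aes(x = cream_rating, y = charcoal_rating)) +
  geom_point(aes(color = winner)) +
  scale_color_manual(values = c(cream = "green", charcoal = "orange"))

# Both plots build without "Continuous value supplied to discrete scale"
continuous_layer <- layer_data(p_continuous, 1)
discrete_layer   <- layer_data(p_discrete, 1)
```

Swapping scale_color_manual into the first plot reproduces your error, because cream_rating is numeric.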
As the plots in your images show only two colors, we need a discrete variable. As your data is about ratings, my guess is that to achieve the plots you have to map winner on color. One question remains: what about draws? In my code below I set the color for draws equal to NA, i.e. draws are dropped. But you can change that as you like.
From the image I also guess that some transparency was used to tackle overplotting. This can be achieved via the alpha argument, which I set to 0.6.
Concerning the cross appearing in your plot: it's hard to tell, but my guess is that the data was "replicated" two times, each time fixing one of your rating variables to its mean value. If this guess is correct, we can get the cross via two additional geom_point layers.
library(ggplot2)
d1 <- structure(list(rated = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE,
TRUE, FALSE, TRUE, TRUE), turns = c(13L, 16L, 61L, 61L, 95L,
5L, 33L, 9L, 66L, 119L), victory_status = structure(c(3L, 4L,
2L, 2L, 2L, 1L, 4L, 4L, 4L, 2L), .Label = c("draw", "mate", "outoftime",
"resign"), class = "factor"), winner = structure(c(2L, 1L, 2L,
2L, 2L, 3L, 2L, 1L, 1L, 2L), .Label = c("charcoal", "cream",
"draw"), class = "factor"), increment_code = structure(c(3L,
7L, 7L, 5L, 6L, 1L, 1L, 4L, 2L, 1L), .Label = c("10+0", "15+0",
"15+2", "15+30", "20+0", "30+3", "5+10"), class = "factor"),
cream_rating = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L,
1520L, 1413L, 1439L, 1381L), charcoal_rating = c(1191L, 1261L,
1500L, 1454L, 1469L, 1002L, 1423L, 2108L, 1392L, 1209L)), row.names = c(NA,
10L), class = "data.frame")
ggplot(d1, aes(x=cream_rating, y=charcoal_rating)) +
# Map winner on color. Add some transparency in case of overplotting
geom_point(aes(color = winner), alpha = 0.6) +
# Just a guess to add the cross: add geom_point layers with one variable fixed at its mean
geom_point(aes(x = mean(cream_rating), color = winner), alpha = 0.6) +
geom_point(aes(y = mean(charcoal_rating), color = winner), alpha = 0.6) +
# Should "draw"s be colored or dropped?
scale_color_manual(values = c(cream = "green", charcoal = "orange", draw = NA)) +
ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5))
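If the cross in the target image is actually a pair of plain reference lines rather than replicated points (I can't tell from the image, so treat this as an assumption), geom_vline and geom_hline at the mean ratings would be an alternative. A minimal sketch on the first five rows of the sample data, where toy and p_lines are illustrative names and the dashed line type is my own choice:

```r
library(ggplot2)

# First five rows of the question's sample data (only the columns used here)
toy <- data.frame(
  cream_rating    = c(1500, 1322, 1496, 1439, 1523),
  charcoal_rating = c(1191, 1261, 1500, 1454, 1469),
  winner          = factor(c("cream", "charcoal", "cream", "cream", "cream"))
)

p_lines <- ggplot(toy, aes(x = cream_rating, y = charcoal_rating)) +
  geom_point(aes(color = winner), alpha = 0.6) +
  # Vertical line at the mean cream rating
  geom_vline(xintercept = mean(toy$cream_rating), linetype = "dashed") +
  # Horizontal line at the mean charcoal rating
  geom_hline(yintercept = mean(toy$charcoal_rating), linetype = "dashed") +
  scale_color_manual(values = c(cream = "green", charcoal = "orange"))

# Inspect where the vertical line was placed
vline_layer <- layer_data(p_lines, 2)
```

Unlike the geom_point version, this draws one line per axis and no extra points, so the cross does not pick up the winner colors.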
EDIT
The shapes don't show up because you did not map winner on the shape aes, so scale_shape_manual has nothing to act on.
The "errors" are simply warnings, which arise because we set the color for draws to NA. These are the rows which ggplot removes, once per geom_point layer. To get rid of the draws, simply filter your dataset before plotting:
library(ggplot2)
library(dplyr)
d1 %>%
filter(winner != "draw") %>%
ggplot(aes(x=cream_rating, y=charcoal_rating, color = winner, shape = winner)) +
# Map winner on color. Add some transparency in case of overplotting
geom_point(alpha = 0.6, na.rm = TRUE) +
# Just a guess to add the cross: add geom_point layers with one variable fixed at its mean
geom_point(aes(x = mean(cream_rating)), alpha = 0.6) +
geom_point(aes(y = mean(charcoal_rating)), alpha = 0.6) +
# Draws are filtered out above, so only two colors and two shapes are needed
scale_color_manual(values = c(cream = "green", charcoal = "orange")) +
scale_shape_manual(values = c(cream = 16, charcoal = 17)) +
ggtitle("Rating of Cream vs Charcoal") +
xlab("rating of cream") + ylab("rating of charcoal")+ theme(plot.title = element_text(hjust = 0.5))
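To check where the number in the warning comes from, you can count the draws in your data. If my reading is right, that count should match the number of removed rows reported for each layer. Below is a small sketch on the first six rows of the sample data, which also shows a base R way to do the filtering if you prefer not to load dplyr. The names d_small, n_draws and d_no_draws are illustrative.

```r
# First six rows of the question's sample data (only the columns used here)
d_small <- data.frame(
  cream_rating    = c(1500L, 1322L, 1496L, 1439L, 1523L, 1250L),
  charcoal_rating = c(1191L, 1261L, 1500L, 1454L, 1469L, 1002L),
  winner          = factor(c("cream", "charcoal", "cream", "cream", "cream", "draw"))
)

# Rows that would get an NA color and therefore be dropped by each layer
n_draws <- sum(d_small$winner == "draw")

# Base R equivalent of filter(winner != "draw");
# droplevels() also removes the now-unused "draw" factor level
d_no_draws <- droplevels(subset(d_small, winner != "draw"))

table(d_no_draws$winner)
```

Dropping the unused level is optional for the plot above, since the manual scales only name cream and charcoal, but it keeps summaries such as table() free of an empty "draw" entry.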
Upvotes: 1