user19422662

R: How to Fill Points in ggplot2 with a variable

I am trying to make a ggplot2 plot in R that groups points into 2D bins. The first plot, where I did not alter the fill, works as expected. But when I map the fill to my variable (Miss.), a numeric value ranging from 0.00 to 0.46, the plot ignores the heat map scale and turns every bin gray.

   ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk))+
   geom_bin_2d(bins = 15)+
   scale_fill_continuous(type = "viridis")+
   ylim(5, 20)+
   xlim(0,15)+
   coord_fixed(1.3)


   ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk, fill = Miss.))+
   geom_bin_2d(bins = 15)+
   scale_fill_continuous(type = "viridis")+
   ylim(5, 20)+
   xlim(0,15)+
   coord_fixed(1.3)

I appreciate any help! Thanks!

Upvotes: 0

Views: 475

Answers (1)

Allan Cameron

Reputation: 173998

I think I understand your problem, so let's replicate it with a reproducible example. Obviously we don't have your data, but the following data frame has the same names, types and ranges as your own data, so this walk-through should work for you.

set.seed(1)

RightFB <- data.frame(TMHrzBrk = runif(1000, 0, 15),
                      TMIndVertBrk = runif(1000, 5, 20),
                      Miss. = runif(1000, 0, 0.46))

Your first plot will look something like this:

library(tidyverse)

ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk)) +
  geom_bin_2d(bins = 15) +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
#> Warning: Removed 56 rows containing missing values (`geom_tile()`).

Here, the fill colors represent the counts of observations within each bin. But if you try to map the fill to Miss., you get all gray squares:

ggplot(data = RightFB, mapping = aes(x = TMHrzBrk, y = TMIndVertBrk,
                                     fill = Miss.)) +
  geom_bin_2d(bins = 15) +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
#> Warning: The following aesthetics were dropped during statistical transformation: fill
#> i This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> i Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?
#> Removed 56 rows containing missing values (`geom_tile()`).

This happens because, by default, geom_bin_2d calculates the bins and the count of observations within each bin, and uses that count as the fill variable. Each bin contains multiple observations, each with a different value of Miss., and geom_bin_2d has no way of knowing how you want to combine them, so it drops the fill aesthetic (hence the warning above). My guess is that you are looking for the average of Miss. within each bin, but this is difficult to achieve within the framework of geom_bin_2d.
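To make the "many values per bin" point concrete, here is a rough sketch in base R that counts how many distinct Miss. values land in each bin of the simulated data. The 1-unit bin edges are an assumption chosen for illustration; they are not necessarily the exact bins geom_bin_2d picks.

```r
set.seed(1)

# Same simulated data as in the example above
RightFB <- data.frame(TMHrzBrk = runif(1000, 0, 15),
                      TMIndVertBrk = runif(1000, 5, 20),
                      Miss. = runif(1000, 0, 0.46))

# Assumed 1-unit bins on each axis (illustrative only)
x_bin <- cut(RightFB$TMHrzBrk, breaks = seq(0, 15, 1))
y_bin <- cut(RightFB$TMIndVertBrk, breaks = seq(5, 20, 1))

# Number of distinct Miss. values in each (x, y) bin; empty bins are NA
n_distinct_per_bin <- tapply(RightFB$Miss., list(x_bin, y_bin),
                             function(v) length(unique(v)))

summary(as.vector(n_distinct_per_bin))
```

Any bin with more than one distinct value has no single "correct" fill, which is why a summary such as the mean has to be chosen explicitly.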

The alternative is to calculate the bins yourself, take the average of Miss. in each bin, and plot the result with geom_tile:

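RightFB %>%
  # Bin each axis into 1-unit intervals, labelling every bin by its midpoint
  mutate(TMHrzBrk = cut(TMHrzBrk, breaks = seq(0, 15, 1),
                        labels = seq(0.5, 14.5, 1)),
         TMIndVertBrk = cut(TMIndVertBrk, breaks = seq(5, 20, 1),
                            labels = seq(5.5, 19.5, 1))) %>%
  group_by(TMHrzBrk, TMIndVertBrk) %>%
  # Average Miss. within each bin
  summarize(Miss. = mean(Miss., na.rm = TRUE), .groups = "drop") %>%
  # Convert the factor labels back to numeric midpoints for plotting
  mutate(across(TMHrzBrk:TMIndVertBrk, ~as.numeric(as.character(.x)))) %>%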
  ggplot(aes(x = TMHrzBrk, y = TMIndVertBrk, fill = Miss.)) +
  geom_tile() +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
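For comparison, ggplot2 also provides stat_summary_2d(), which bins x and y and applies a summary function to a z aesthetic. The following is only a sketch, assuming the per-bin mean is the summary you want and reusing the simulated data; the bins it chooses will not exactly match the hand-made 1-unit bins above, and scale_fill_viridis_c() is used here in place of scale_fill_continuous(type = "viridis").

```r
library(ggplot2)

set.seed(1)

# Same simulated data as in the example above
RightFB <- data.frame(TMHrzBrk = runif(1000, 0, 15),
                      TMIndVertBrk = runif(1000, 5, 20),
                      Miss. = runif(1000, 0, 0.46))

# Map Miss. to z (not fill); the stat computes mean(z) per bin
# and uses that value as the fill
p <- ggplot(RightFB, aes(x = TMHrzBrk, y = TMIndVertBrk, z = Miss.)) +
  stat_summary_2d(fun = mean, bins = 15) +
  scale_fill_viridis_c(name = "Mean Miss.") +
  coord_fixed(1.3)

p
```

The manual approach gives you full control over the bin edges, whereas this version trades that control for brevity.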


EDIT

With the link to the data in the comments, here is a full reprex:

library(tidyverse)

RightFB <- read.csv(paste0("https://raw.githubusercontent.com/rileyfeltner/",
                           "FB-Analysis/main/Right%20FB.csv"))

RightFB <- RightFB[c(2:6, 9, 11, 13, 18, 19)]
RightFB$Miss. <- as.numeric(as.character(RightFB$Miss.))
#> Warning: NAs introduced by coercion
RightFB$TMIndVertBrk <- as.numeric(as.character(RightFB$TMIndVertBrk))
#> Warning: NAs introduced by coercion
RightFB <- na.omit(RightFB)
RightFB1 <- filter(RightFB, P > 24)

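RightFB %>%
  # Bin each axis into 1-unit intervals, labelling every bin by its midpoint
  mutate(TMHrzBrk = cut(TMHrzBrk, breaks = seq(0, 15, 1),
                        labels = seq(0.5, 14.5, 1)),
         TMIndVertBrk = cut(TMIndVertBrk, breaks = seq(5, 20, 1),
                            labels = seq(5.5, 19.5, 1))) %>%
  group_by(TMHrzBrk, TMIndVertBrk) %>%
  # Average Miss. within each bin
  summarize(Miss. = mean(Miss., na.rm = TRUE), .groups = "drop") %>%
  # Convert the factor labels back to numeric midpoints for plotting
  mutate(across(TMHrzBrk:TMIndVertBrk, ~as.numeric(as.character(.x)))) %>%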
  ggplot(aes(x = TMHrzBrk, y = TMIndVertBrk, fill = Miss.)) +
  geom_tile() +
  scale_fill_continuous(type = "viridis") +
  ylim(5, 20) +
  xlim(0, 15) +
  coord_fixed(1.3)
#> Warning: Removed 18 rows containing missing values (`geom_tile()`).

Created on 2022-11-23 with reprex v2.0.2
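A likely source of the "Removed 18 rows" warning (an assumption; I have not traced every row) is that cut() returns NA for any value outside the breaks, and with the default settings also for a value exactly equal to the lowest break. Those NA bins survive group_by() and summarize() and are then dropped by geom_tile. A small sketch of the binning and the round trip back to numeric midpoints, using made-up input values:

```r
# Made-up values: on the lowest break, inside the range, on the top break, above it
x <- c(0, 0.3, 7.2, 15, 16)

# Same binning as above: intervals (0,1], (1,2], ..., (14,15], labelled by midpoint
binned <- cut(x, breaks = seq(0, 15, 1), labels = seq(0.5, 14.5, 1))

# Factor labels back to numeric midpoints
mid <- as.numeric(as.character(binned))

mid
```

If the lowest break itself should be kept, cut() has an include.lowest = TRUE argument; values beyond the breaks would still become NA.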

Upvotes: 0
