Lutema

Reputation: 13

ggplot2 Unable to fill histogram with color by column

I am trying to plot the distribution of one variable, broken down by the values of another, so I have

ggplot(data = Q11b) +
 geom_histogram(mapping = aes(x = WEIGHT2, fill = EDUCA), binwidth = 5)

Here, WEIGHT2 is continuous, and I'm trying to base the fill on 'EDUCA', which is a column of numbers 1-6. I am expecting the histogram to show the distribution with 6 colors in a legend, like I've seen in many examples. Instead, I get this:

[image: the resulting histogram, not filled by EDUCA]

Do you have any idea where I went wrong? I tried geom_freqpoly() and had the same issue. I also tried changing EDUCA to a different column, but still couldn't see what I was doing wrong. Any help would be appreciated.

Upvotes: 0

Views: 258

Answers (1)

Allan Cameron

Reputation: 173813

The following data set should approximate yours according to your description:

set.seed(1)

Q11b <- subset(data.frame(WEIGHT2 = rgamma(1e5, 3) * 20 + 100,
                          EDUCA = sample(6, 1e5, TRUE)), WEIGHT2 < 320)

Your code gives a similar result on this data, and also emits a warning:

library(ggplot2) 

ggplot(data = Q11b) +
  geom_histogram(mapping = aes(x = WEIGHT2, fill = EDUCA), binwidth = 5)
#> Warning: The following aesthetics were dropped during statistical transformation: fill
#> i This can happen when ggplot fails to infer the correct grouping structure in
#>   the data.
#> i Did you forget to specify a `group` aesthetic or to convert a numerical
#>   variable into a factor?
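
The warning's first suggestion, a `group` aesthetic, also appears to work if you want to keep EDUCA numeric. Here is a minimal sketch, assuming the same simulated data as above (the name `p_grouped` is only illustrative). With one group per EDUCA value, fill is constant within each group, so it should survive the binning. The legend is then a continuous color bar rather than six discrete keys.

```r
library(ggplot2)

set.seed(1)

# Same simulated stand-in for the asker's data as above
Q11b <- subset(data.frame(WEIGHT2 = rgamma(1e5, 3) * 20 + 100,
                          EDUCA = sample(6, 1e5, TRUE)), WEIGHT2 < 320)

# group = EDUCA makes the binning run separately for each EDUCA value;
# fill stays numeric, so ggplot uses a continuous color scale
p_grouped <- ggplot(data = Q11b) +
  geom_histogram(mapping = aes(x = WEIGHT2, fill = EDUCA, group = EDUCA),
                 binwidth = 5)

p_grouped
```
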

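The warning points at the problem: EDUCA is numeric, so ggplot treats it as continuous and cannot use it to split the bars into groups. If we simply follow the advice and convert EDUCA to a factor, we get our colored groups: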

ggplot(data = Q11b) +
  geom_histogram(mapping = aes(x = WEIGHT2, fill = factor(EDUCA)), binwidth = 5)
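
If EDUCA is used in several plots, it may be more convenient to convert it once in the data frame rather than inside each aes() call. The sketch below assumes the simulated data from above. The column name `EDUCA_f` and the object names `p_hist` and `p_poly` are hypothetical. It also applies the same fix to geom_freqpoly(), which the question reports as having the same issue; lines take `colour` rather than `fill`.

```r
library(ggplot2)

set.seed(1)

# Same simulated stand-in for the asker's data as above
Q11b <- subset(data.frame(WEIGHT2 = rgamma(1e5, 3) * 20 + 100,
                          EDUCA = sample(6, 1e5, TRUE)), WEIGHT2 < 320)

# Convert once; EDUCA_f is a hypothetical column name
Q11b$EDUCA_f <- factor(Q11b$EDUCA)

# Histogram: a discrete fill gives one color per EDUCA level
p_hist <- ggplot(data = Q11b) +
  geom_histogram(mapping = aes(x = WEIGHT2, fill = EDUCA_f), binwidth = 5) +
  labs(fill = "EDUCA")

# Frequency polygon: one line per EDUCA level, colored by level
p_poly <- ggplot(data = Q11b) +
  geom_freqpoly(mapping = aes(x = WEIGHT2, colour = EDUCA_f), binwidth = 5) +
  labs(colour = "EDUCA")

p_hist
p_poly
```
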

Created on 2023-02-13 with reprex v2.0.2

Upvotes: 1
