Reputation: 765
I'm trying to learn how to generate heat maps in R, so sorry if these questions seem really basic. Let's say I have this table (a bit contrived, but I'm just trying to practice here):
  NumHours FavePet FaveFood
1        3     Cat   Burger
2        2     Cat    Pizza
3        5    Fish    Pizza
4        2     Dog    Pizza
5        4    Fish    Apple
6        3     Dog   Burger
7        3     Cat    Pizza
8        1     Cat   Burger
9        6     Dog    Apple
The dput structure is below:
structure(list(NumHours = c(3L, 2L, 5L,2L, 4L, 3L, 3L, 1L, 6L),
FavePet = structure(c(2L, 2L, 3L, 1L, 3L, 1L, 2L, 2L, 1L),
.Label = c("Dog", "Cat", "Fish"), class = "factor"),
FaveFood = structure(c(3L, 2L, 2L, 2L, 1L, 3L, 2L, 3L, 1L),
.Label = c("Apple", "Pizza", "Burger"), class = "factor")),
.Names = c("NumHours", "FavePet", "FaveFood"), row.names = c(NA, 9L), class = "data.frame")
I'd like to generate a heat map where FaveFood is on the x-axis, FavePet is on the y-axis, and the average number of hours for the pair is the intensity of the color. For example, since there are two "Cat Pizza" values (2, 3), then a color corresponding to 2.5 would be plotted, and this would be lighter than the value of Dog Apple, which has a value of 6.
So far, I have the following, which creates the correct structure, but doesn't incorporate averages (not sure where to put it... it's probably something like fun.y = mean, but I'm not applying it to y or x, so I don't know how to call it).
ggplot(df, aes(x=FaveFood, y=FavePet, fill=as.factor(NumHours))) + geom_tile(aes(color="white"))
I'd also like the colors to range from yellow to red based on the value, so I added:
+ scale_fill_gradient(low="yellow", high="red")
But this leads to this error, which I'm not sure how to fix.
Error: Discrete value supplied to continuous scale
Your help is really appreciated! I'd like to learn how to do this properly :)
Upvotes: 1
Views: 138
Reputation: 452
First, you could use the mutate function from dplyr to generate a new variable, AvgHours, which holds the mean of NumHours for each FavePet/FaveFood pair.
library(dplyr)
df <- df %>% group_by(FavePet, FaveFood) %>% mutate(AvgHours = mean(NumHours))
Then you can use ggplot's geom_tile to plot the desired heatmap.
ggplot(df, aes(FaveFood,FavePet)) + geom_tile(aes(fill = AvgHours)) + scale_fill_gradient(low = "yellow", high = "red")
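As a variation on the above, here is a minimal sketch that collapses the data to one row per pair before plotting, so each tile is drawn only once. It uses base R's aggregate instead of mutate (my substitution, not part of the answer above). The names avg and p are only for illustration, and df is rebuilt from the question's table so the snippet runs on its own.

```r
library(ggplot2)

# Data from the question, rebuilt with the same factor level order
df <- data.frame(
  NumHours = c(3L, 2L, 5L, 2L, 4L, 3L, 3L, 1L, 6L),
  FavePet  = factor(c("Cat", "Cat", "Fish", "Dog", "Fish", "Dog", "Cat", "Cat", "Dog"),
                    levels = c("Dog", "Cat", "Fish")),
  FaveFood = factor(c("Burger", "Pizza", "Pizza", "Pizza", "Apple",
                      "Burger", "Pizza", "Burger", "Apple"),
                    levels = c("Apple", "Pizza", "Burger"))
)

# One row per (FavePet, FaveFood) pair, holding the mean of NumHours
avg <- aggregate(NumHours ~ FavePet + FaveFood, data = df, FUN = mean)
names(avg)[names(avg) == "NumHours"] <- "AvgHours"  # illustrative column name

# AvgHours is numeric, so the continuous yellow-to-red gradient applies
p <- ggplot(avg, aes(x = FaveFood, y = FavePet, fill = AvgHours)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "yellow", high = "red")
p
```

With the question's data, avg has 7 rows (one per pair that actually occurs), with Cat/Pizza at 2.5 and Dog/Apple at 6.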
Upvotes: 0
Reputation: 1344
Try a basic heatmap like:
# Note: this fills by the raw NumHours; pairs that occur more than once
# draw overlapping tiles rather than an average.
ggplot(df, aes(FaveFood, FavePet)) +
  geom_tile(aes(fill = NumHours), colour = "black") +
  scale_fill_gradient(name = "NumHours", low = "yellow", high = "red") +
  labs(title = "Heatmap FaveFood and FavePet", x = "FaveFood", y = "FavePet")
There is a reason that you get the error:
Error: Discrete value supplied to continuous scale
This happens because scale_fill_gradient is a continuous scale, but fill=as.factor(NumHours) turns your numeric values into a factor, which is discrete. ggplot2 cannot build a gradient from a factor, so that is where it went wrong.
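To make the explanation concrete, here is a small sketch, assuming the data from the question (rebuilt inline so it runs on its own). The names bad, err and good are only for illustration. It checks the two types, captures the error from the factor version, and then builds the version that maps the numeric column directly.

```r
library(ggplot2)

# Data from the question, rebuilt with the same factor level order
df <- data.frame(
  NumHours = c(3L, 2L, 5L, 2L, 4L, 3L, 3L, 1L, 6L),
  FavePet  = factor(c("Cat", "Cat", "Fish", "Dog", "Fish", "Dog", "Cat", "Cat", "Dog"),
                    levels = c("Dog", "Cat", "Fish")),
  FaveFood = factor(c("Burger", "Pizza", "Pizza", "Pizza", "Apple",
                      "Burger", "Pizza", "Burger", "Apple"),
                    levels = c("Apple", "Pizza", "Burger"))
)

is.numeric(df$NumHours)            # TRUE: continuous, fine for a gradient
is.factor(as.factor(df$NumHours))  # TRUE: discrete, not fine for a gradient

# Factor fill combined with a continuous scale: building the plot fails
bad <- ggplot(df, aes(x = FaveFood, y = FavePet, fill = as.factor(NumHours))) +
  geom_tile() +
  scale_fill_gradient(low = "yellow", high = "red")
err <- tryCatch({ ggplot_build(bad); NULL }, error = function(e) e)

# Numeric fill: the gradient works. A constant border colour goes
# outside aes(), since aes() is for mapping columns, not fixed values.
good <- ggplot(df, aes(x = FaveFood, y = FavePet, fill = NumHours)) +
  geom_tile(colour = "white") +
  scale_fill_gradient(low = "yellow", high = "red")
good
```

This only removes the error; it still plots the raw values, so the averaging per pair has to be done separately.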
Good luck!
Upvotes: 1