Reputation: 976
I am trying to make a heatmap using the ggplot2 package, but I have trouble controlling the colors and breaks. I have 18 questions, 22 firms, and the mean value of each firm's responses on a 1 to 5 scale.
Say I want the ranges (0-1), (1-2), (2-3), (3-4), (4-5) to be color coded, either with distinct colors (blue, green, red, yellow, purple) or on a gradient scale, with NA values shown in black. In short: how do I choose colors and breaks?
I would also like to fix the order on the axis to "Question1, Question2, ..., Question18", and likewise for the firms. At the moment I believe the columns being of class "factor" is what causes this problem.
> head(mydf, 20)
Firm Question Value
1 1 Question1 3.6675482217047
2 1 Question2 3.74327628361858
3 1 Question3 <NA>
4 1 Question4 <NA>
5 1 Question5 <NA>
6 1 Question6 <NA>
7 1 Question7 0.352078239608802
8 1 Question8 3.04180471049169
9 1 Question9 3.9559090659924
10 1 Question10 <NA>
11 1 Question11 1
12 1 Question12 4.26591296778731
13 1 Question13 3.95256943635996
14 1 Question14 0.465686274509804
15 1 Question15 2.61764705882353
16 1 Question16 1.83333333333333
17 1 Question17 <NA>
18 1 Question18 0.225490196078431
19 2 Question1 3.85714285714286
20 2 Question2 4
> ggplot(mydf, aes(Question, Firm, fill=Value)) + geom_tile() + theme(axis.text.x = element_text(angle=330, hjust=0))
https://i.sstatic.net/BBb3x.jpg Link to picture of my current plot.
Upvotes: 1
Views: 1179
Reputation: 3991
The root of your problem appears to be that Value is a factor rather than a numeric vector. I infer this from the fact that in the head() output NA values are written as <NA>, which is how R prints missing values in factor (or character) columns; a numeric column would show a plain NA. The image you link to is ggplot's default behavior for coloring based on a factor; the default coloration for numeric is much closer to what you want.
You can check whether this is indeed the case with class(mydf$Value). If it is a factor, convert it to numeric with the following:
mydf$Value <- as.numeric(as.character(mydf$Value))
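The as.character() step matters: calling as.numeric() directly on a factor returns the internal level codes, not the numbers you see printed. A minimal sketch on a made-up factor (the values are illustrative, not taken from your data):

```r
# Toy factor standing in for mydf$Value (values invented for illustration)
f <- factor(c("3.67", "0.35", NA, "4"))

as.numeric(f)                # 2 1 NA 3  (integer level codes)
as.numeric(as.character(f))  # 3.67 0.35 NA 4.00  (the actual numbers)
```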
Your plotting code as written will now return a graph which looks like this:
You can play around with the exact visualization using the gradient scale, or add a manual scale.
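As a rough sketch of both options, assuming Value is already numeric: the first plot uses scale_fill_gradient() with explicit breaks and na.value = "black", the second bins the values with cut() and assigns one color per bin with scale_fill_manual(). The data frame toy, the Bin column, and the specific colors are stand-ins made up for this example, so adjust them to your data.

```r
library(ggplot2)

# Hypothetical stand-in for mydf: 3 firms x 4 questions, one missing value
toy <- expand.grid(Firm = factor(1:3), Question = paste0("Question", 1:4))
toy$Value <- c(3.7, 0.4, NA, 4.3, 1.0, 2.6, 1.8, 0.2, 3.9, 4.0, 2.2, 4.9)

# Option 1: continuous gradient with explicit legend breaks; NA tiles are black
p_gradient <- ggplot(toy, aes(Question, Firm, fill = Value)) +
  geom_tile() +
  scale_fill_gradient(low = "white", high = "steelblue",
                      limits = c(0, 5), breaks = 0:5, na.value = "black")

# Option 2: bin values into [0,1], (1,2], ..., (4,5] and color each bin separately
toy$Bin <- cut(toy$Value, breaks = 0:5, include.lowest = TRUE)
p_binned <- ggplot(toy, aes(Question, Firm, fill = Bin)) +
  geom_tile() +
  scale_fill_manual(values = c("blue", "green", "red", "yellow", "purple"),
                    na.value = "black", drop = FALSE)
```

Printing p_gradient or p_binned draws the plot. drop = FALSE keeps all five bins in the legend even if some of them are empty in the data.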
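As for your other question, reordering that factor is quite simple. Adapted from R-bloggers:
# levels sort alphabetically (Question1, Question10, ..., Question18, Question2, ..., Question9),
# so pick them out by position in natural order
mydf$Question <- factor(mydf$Question, levels(mydf$Question)[c(1, 11:18, 2:10)])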
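If counting positions feels error-prone, another way is to spell out the desired level order by name. This is only a sketch on a toy vector q standing in for mydf$Question, and it assumes the labels are exactly "Question1" through "Question18":

```r
# Toy column standing in for mydf$Question
q <- factor(paste0("Question", 1:18))
levels(q)[1:3]   # "Question1" "Question10" "Question11" (alphabetical default)

# Rebuild the factor with the levels named in the order you want on the axis
q <- factor(q, levels = paste0("Question", 1:18))
levels(q)[1:3]   # "Question1" "Question2" "Question3"
```

The same idea should work for the firms, e.g. factor(mydf$Firm, levels = 1:22), assuming the firms are labelled 1 to 22.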
Upvotes: 1