I'm creating a heatmap in R and am new to the language. I'm trying to reorder the y axis so that the cities with the most yearly occurrences show up at the top and those with the fewest at the bottom. I kept getting errors, and once I got past them the order of the plot still didn't change. I've tried lots of things already, so I figured it was worth asking.
The only relevant variable names are Month_num and Australian_City. Here's what I've got:
# I've included my discarded ideas too, as comments
require(ggplot2)
require(dplyr)
add_count(flights, Australian_City)
ggplot(flights, aes(x=Month_num %>% reorder(count.Freq), y=Australian_City)) + geom_bin2d() + scale_x_discrete(labels=c("Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")) + labs(x="Month", y="Flights per city") + ggtitle("Monthly International Flights Per City")
#city_counts = flights %>% group_by(Australian_City) %>% count()
#ave(age, gender, FUN = length))
#flights %>% mutate(num_by_city=ave(Australian_City, FUN=length))
#flights$Australian_City <- flights$Australian_City %>% reorder(flights$n)
#flights <- transform(flights, count=table(Australian_City)[Australian_City])
#flights %>% mutate(num_by_city= case_when(city_counts$Australian_City==Australian_City ~ city_counts$n))
#flights %>% mutate(visit_count = sum(flights$))
I can see one of those discarded ideas working, but I've got no idea how :(. Both Month_num and Australian_City are factors, but the month is stored as the integers 1 through 12. Any help would be appreciated!
Upvotes: 0
Views: 587
I tried to reproduce your situation. First, create a dataset with random monthly counts per city:
require(ggplot2)
require(dplyr)
library(tidyr)
Adelaide <- sample(1:300, 12, replace=TRUE)
Darwin <- sample(1:300, 12, replace=TRUE)
Calms <- sample(1:300, 12, replace=TRUE)
Canberra <- sample(1:300, 12, replace=TRUE)
Melbourne <- sample(1:300, 12, replace=TRUE)
data <- data.frame(Adelaide, Darwin, Calms, Canberra, Melbourne)
data$Month <- format(ISOdatetime(2000,1:12,1,0,0,0),"%b")
   Adelaide Darwin Calms Canberra Melbourne Month
1        91    148    10      246        45   gen
2       175    156   247      118         1   feb
3       244    232    18      287        74   mar
4       123      5    75      194       136   apr
5       142    267    19      155        75   mag
6       166    292   263      266       187   giu
7        18     72    61       83       197   lug
8       294     97    69       15         3   ago
9       234    135    80        8       267   set
10      181    134    54       64       203   ott
11      232    197    50      145        39   nov
12      177     20    68       32       299   dic
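The month labels above (gen, feb, ..., dic) appear to be Italian because format(..., "%b") follows the system locale. If English labels are wanted regardless of locale, one option is the built-in base R constant month.abb. The sketch below is only an illustration: the seed and the two-city data frame are assumptions, not part of the original answer.

```r
# Sketch: English month labels that do not depend on the system locale.
# month.abb is a base R constant: "Jan", "Feb", ..., "Dec".
set.seed(1)  # assumption: fixed seed so the random counts are reproducible

data <- data.frame(
  Adelaide = sample(1:300, 12, replace = TRUE),
  Darwin   = sample(1:300, 12, replace = TRUE)
)

# Store the months as a factor whose levels follow calendar order,
# so that plots do not sort them alphabetically.
data$Month <- factor(month.abb, levels = month.abb)

print(data)
```

Because the levels are set explicitly, the later factor(..., levels = unique(...)) step for Month becomes optional with this variant.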
Then gather it:
data <- gather(data, "City", "Count", 1:5) # replace 5 with your actual number of cities
data$Month <- as.character(data$Month)
data$Month <- factor(data$Month, levels=unique(data$Month))
data$City <- as.character(data$City)
data$City <- factor(data$City, levels=unique(data$City))
   Month     City Count
1    gen Adelaide    91
2    feb Adelaide   175
3    mar Adelaide   244
4    apr Adelaide   123
5    mag Adelaide   142
6    giu Adelaide   166
7    lug Adelaide    18
8    ago Adelaide   294
9    set Adelaide   234
10   ott Adelaide   181
11   nov Adelaide   232
12   dic Adelaide   177
13   gen   Darwin   148
14   feb   Darwin   156
15   mar   Darwin   232
16   apr   Darwin     5
17   mag   Darwin   267
18   giu   Darwin   292
..   ...      ...   ...
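In recent tidyr releases (1.0.0 and later), gather() is marked as superseded in favour of pivot_longer(). A rough equivalent of the reshaping step above could look like the following sketch. The names wide and long, the seed, and the two-city data are assumptions made for illustration. The rows come out in a different order than with gather() (month by month rather than city by city), but the columns and values are the same.

```r
library(tidyr)

set.seed(1)  # assumption: fixed seed for reproducible example data

# Hypothetical wide table: one column per city, one row per month.
wide <- data.frame(
  Adelaide = sample(1:300, 12, replace = TRUE),
  Darwin   = sample(1:300, 12, replace = TRUE),
  Month    = month.abb
)

# Reshape every column except Month into City/Count pairs.
long <- pivot_longer(wide, cols = -Month, names_to = "City", values_to = "Count")

# Fix the level order so plots keep calendar order and column order.
long$Month <- factor(long$Month, levels = month.abb)
long$City  <- factor(long$City, levels = unique(long$City))

print(head(long))
```

Selecting columns with cols = -Month avoids hard-coding the number of cities.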
Then plot the heatmap (not ordered):
ggplot(data, aes(x= Month, y = City , fill= Count)) + geom_tile()
Finally, you can arrange the rows so that the cities with the most yearly occurrences show up at the top. These are the yearly totals per city in ascending order, followed by the plot reordered accordingly:
    Calms Melbourne  Canberra    Darwin  Adelaide
     1014      1526      1613      1755      2077
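The answer does not show how these totals were produced. One way to get such a table with base R is tapply() followed by sort(), sketched here on made-up data (the seed, the three cities, and the name totals are assumptions):

```r
set.seed(1)  # assumption: fixed seed for reproducible example data

# Hypothetical long-format data: one row per city and month.
data <- data.frame(
  Month = rep(month.abb, times = 3),
  City  = rep(c("Adelaide", "Darwin", "Canberra"), each = 12),
  Count = sample(1:300, 36, replace = TRUE)
)

# Sum Count within each city, then sort ascending.
totals <- tapply(data$Count, data$City, sum)
print(sort(totals))
```

The ascending order matters: ggplot2 draws the first factor level at the bottom of a discrete y axis, so the city with the largest total ends up at the top.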
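# reorder() sorts by the mean of Count by default; FUN = sum orders by the yearly total instead
ggplot(data, aes(x = Month, y = reorder(City, Count, FUN = sum), fill = Count)) + geom_tile()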
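The same idea can be applied to data shaped like the question's, where each row is a single flight and there is no Count column. In the question's attempt, the result of add_count() is never assigned back to flights, and reorder() is applied to the x aesthetic rather than to y, which may be why the row order did not change. The sketch below uses a made-up stand-in for flights (the seed, the city list, and the sampling probabilities are assumptions) and reorders the city factor by its own frequency:

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed for reproducible example data

# Hypothetical stand-in for the asker's data: one row per flight.
flights <- data.frame(
  Month_num = factor(sample(1:12, 500, replace = TRUE), levels = 1:12),
  Australian_City = factor(sample(
    c("Adelaide", "Darwin", "Canberra", "Melbourne"), 500,
    replace = TRUE, prob = c(0.4, 0.1, 0.2, 0.3)
  ))
)

# Reorder the city levels by how many rows each city has (ascending),
# and assign the result back so the change is kept.
flights$Australian_City <- reorder(
  flights$Australian_City, flights$Australian_City, FUN = length
)

p <- ggplot(flights, aes(x = Month_num, y = Australian_City)) +
  geom_bin2d() +
  scale_x_discrete(labels = month.abb) +
  labs(x = "Month", y = "City") +
  ggtitle("Monthly International Flights Per City")
print(p)
```

Reordering the factor in the data frame, rather than inside aes(), also keeps the axis title clean and makes the order reusable in other plots.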
Upvotes: 1