Reputation: 417
I've used the following approach to create three histograms. The fourth one suddenly has a reversed order on the x-axis. However, there's nothing (at least nothing I know about) in the snippet that should affect the order.
The x-axis is expected to start with the lowest value on the left.
Here's the R code:
df <- mydata %>% mutate(length.class=cut(mydata$count,breaks = c(1,10,100,1000,10000,100000,1000000,10000000),include.lowest=TRUE,dig.lab=8)) %>% group_by(length.class) %>% summarise(count = n())
dftext <- as.data.frame(table(df$length.class))
colnames(dftext)[1] <- "x"
dftext$lab[dftext$x == "[1,10]"] <- 1063393
dftext$lab[dftext$x == "(10,100]"] <- 65986
dftext$lab[dftext$x == "(100,1000]"] <- 3206
dftext$lab[dftext$x == "(1000,10000]"] <- 386
dftext$lab[dftext$x == "(10000,100000]"] <- 32
dftext$lab[dftext$x == "(100000,1000000]"] <- 0
dftext$lab[dftext$x == "(1000000,10000000]"] <- 1
df$count[df$length.class == "(1000000,10000000]"] <- 1.1 # To make its bar visible on the log scale
fmt <- function(decimals=0){
function(x) format(x,scientific = FALSE)
}
ggplot(df,aes(length.class,count)) + geom_bar(stat = "identity",width=0.9,fill="#999966") + scale_y_log10(labels = fmt()) + labs(x="", y="") + geom_text(data=dftext, aes(x=x, y=2, label=lab), size = 6) + theme(text = element_text(size=20)) +
theme(axis.line = element_line(colour = "black"),
panel.grid.major = element_line(color = "grey"),
panel.grid.minor = element_line(color = "grey"),
panel.background = element_blank(),
axis.title.x = element_text(margin=margin(t = 15, unit = "pt")),
axis.text.x = element_text(angle = 45, hjust = 1))
What is causing the reverse order and how can I get rid of it?
Edit: You guys are fast! :) The answer of @mark-peterson looks pretty solid, but I didn't get any working results with it. Here's the requested data: mydata.csv
Upvotes: 1
Views: 2781
Reputation: 9560
When given text labels, geom_bar converts them to a factor and sorts the bars. My guess is that alphabetical and numerical order matched up for your previous uses, but did not for this one. I thought that @Pierre was right about scale_x_reverse(), but it doesn't appear to work on factors. Instead, you will need to set the factor order yourself. Without sample data, it is hard to help do that.
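Setting the level order by hand could look roughly like the sketch below. It uses base R only; the `labels` vector and the `wanted` order are made up for illustration and are not taken from the question's data.

```r
# Hypothetical bin labels stored as plain character strings (not the asker's data)
labels <- c("(10,100]", "[1,10]", "(100,1000]", "(10,100]")

# Default: factor() sorts the unique strings, which need not match numeric bin order
default_order <- factor(labels)
levels(default_order)

# Explicit: pass the levels in the order they should appear on the axis
wanted <- c("[1,10]", "(10,100]", "(100,1000]")
explicit_order <- factor(labels, levels = wanted)
levels(explicit_order)
```

Plotting `explicit_order` instead of the raw strings should then give bars in the `wanted` order.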
A better question, however, is why you are doing so much work by hand here. The tools exist to automate much of your setup, with the added benefit of reducing errors and sorting the factor correctly. For example, with some reproducible data:
temp <- data.frame(a = 1:999)
temp$binned <-
cut(temp$a, 10^(0:3), include.lowest = TRUE)
forText <-
table(temp$binned) %>%
as.data.frame()
ggplot(temp, aes(x = binned)) +
geom_bar() +
geom_text(data = forText
, aes(x = Var1
, y = 75
, label = Freq))
If you just want a picture of the distribution, you can be even faster with a histogram:
ggplot(temp, aes(a)) +
geom_histogram() +
scale_x_log10()
(Also, in the future, try to strip down to an MWE -- no need to include lots of theme settings if they are not germane to the problem.)
Using the posted data, I got the plot to work with my approach above. Note that you would need to add the additional theme and scale arguments. You also need to make use of @aosmith's answer about the missing value. (Which, I think, means that @aosmith's answer actually answers your question, while mine may be just good advice for how to do this more quickly.)
mydata$binned <-
cut(mydata$count,breaks = c(1,10,100,1000,10000,100000,1000000,10000000),include.lowest=TRUE,dig.lab=8)
forText <-
table(mydata$binned) %>%
as.data.frame()
ggplot(mydata, aes(x = binned)) +
geom_bar() +
geom_text(data = forText
, aes(x = Var1
, y = 75
, label = Freq)) +
scale_x_discrete(drop = FALSE)
Upvotes: 1
Reputation: 36076
Your two datasets have the same levels of the factors length.class and x, but there is no row for (100000,1000000] in your first dataset, df. This is because summarise has no drop = FALSE option to keep all levels of a factor in the dataset regardless of whether they have any observations.
As you built your plot using the dataset with fewer factor levels present in its rows, it looks like ggplot2 gets confused when you add the new layer that has more factor levels, and things get ordered oddly.
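One way to check for such an empty bin is sketched below in base R. The `counts` and `breaks` values are invented so that exactly one bin stays empty; they are not the question's data.

```r
# Made-up counts chosen so that one bin, (100,1000], stays empty
counts <- c(3, 7, 42, 5000, 12)
breaks <- c(1, 10, 100, 1000, 10000)
binned <- cut(counts, breaks = breaks, include.lowest = TRUE, dig.lab = 8)

# All levels defined by cut(), whether or not any value falls into them
all_levels <- levels(binned)

# Levels that actually occur in the data
present <- unique(as.character(binned))

# Levels with no rows: these are the ones missing from a per-group summary
missing_levels <- setdiff(all_levels, present)
missing_levels

# table() keeps the empty bin, with a count of 0
bin_counts <- as.data.frame(table(binned))
bin_counts
```

If `missing_levels` is non-empty, the summarised data frame and the label data frame do not cover the same set of bins.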
A fix is to make sure the x axis doesn't drop any factor levels by using drop = FALSE in scale_x_discrete. That way you will be working with the same factor levels for the x axis for both datasets and things won't get mis-ordered.
+ scale_x_discrete(drop = FALSE)
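For context, here is a small self-contained sketch of where that line goes. The data frames `bars` and `labs_df` and their values are hypothetical stand-ins for the question's df and dftext, with one bin deliberately missing from `bars`.

```r
library(ggplot2)

# Hypothetical stand-in data: the (100,1000] bin has no row in `bars`
lv <- c("[1,10]", "(10,100]", "(100,1000]", "(1000,10000]")
bars <- data.frame(
  bin = factor(c("[1,10]", "(10,100]", "(1000,10000]"), levels = lv),
  n   = c(120, 30, 2)
)
# The label layer covers every bin, including the empty one
labs_df <- data.frame(
  bin = factor(lv, levels = lv),
  lab = c(120, 30, 0, 2)
)

p <- ggplot(bars, aes(bin, n)) +
  geom_bar(stat = "identity") +
  geom_text(data = labs_df, aes(x = bin, y = 1, label = lab)) +
  scale_x_discrete(drop = FALSE)  # keep every factor level on the x axis

# print(p) draws the plot
```

Both layers share the same level order in `lv`, so with drop = FALSE the axis should show all four bins in that order.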
Upvotes: 3