CoderGuy123

Reputation: 6649

ggplot2 geom_bar, ticks and limits

I have the following function:

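library(ggplot2)

# Count the number of NAs in each row (case) of x
miss.case = function(x){
  y = apply(x, 1, is.na)  # one column per case after apply()
  y = apply(y, 2, sum)
  return(y)
}
# Plot the distribution of NAs per case, as percentages or raw counts
miss.hist = function(df, percent=TRUE) {
  m = miss.case(df)
  d = data.frame(number.of.NA = m)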
  max.miss = max(m)
  min.miss = min(m)

  if (percent) {
    d$percent = (d$number.of.NA/sum(d$number.of.NA))*100
    g = ggplot(data = d, aes(x = number.of.NA)) +
      geom_bar(aes(y = ((..count..)/sum(..count..))*100)) + 
      scale_y_continuous('percent') +
      xlab("Number of NAs") +
      scale_x_discrete(breaks=min.miss:max.miss)
    return(g)
  }
  else {
    g = ggplot(data = d, aes(x = number.of.NA)) +
      geom_histogram() +
      xlab("Number of NAs") +
      scale_x_discrete(breaks=min.miss:max.miss)
    return(g)
  }
}

This makes a nice histogram of missing data by case with ggplot2. Almost. To see the problem, try it with some test data:

#make some test data
test.data = as.data.frame(iris)
set.seed(1)
which.remove = cbind(sample(1:150, 250, T),
                     sample(1:5, 250, T))
for (row in 1:nrow(which.remove)) {
  test.data[which.remove[row,1],which.remove[row,2]] = NA
}

#plot missing
miss.hist(test.data)

This should give you the following plot:

[image: plot produced by miss.hist(test.data)]

You can see what is wrong: the right part of the plot is oddly empty. You might think this is easy to solve by setting the limits, i.e. limits=c(min.miss, max.miss). It does remove the empty space, but it also removes the ticks!

[image: the same plot with the limits set, now without x-axis ticks]

Changing the order of the breaks and limits arguments does not make a difference. How do I fix both problems?

Upvotes: 2

Views: 1290

Answers (1)

scoa

Reputation: 19867

You are using a discrete scale with an integer vector. Convert the variable to a factor instead, with one level for every value from 0 to max.miss:

g = ggplot(data = d, aes(x = factor(number.of.NA,levels=as.character(seq(0,max.miss,1))))) +
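For completeness, here is a rough sketch of how the whole function might look with that change applied. This is not the original poster's final code, just one way to put it together under a few assumptions: the factor is built once in the data frame rather than inside aes(), miss.case is rewritten with rowSums(is.na(x)) (which should give the same counts), scale_x_discrete(drop = FALSE) is added so that NA counts with no cases still get a tick, and after_stat() (available in ggplot2 3.3.0 and later) stands in for the older ..count.. notation.

```r
library(ggplot2)

# Count NAs per row (case); assumed equivalent to the question's miss.case()
miss.case <- function(x) rowSums(is.na(x))

miss.hist <- function(df, percent = TRUE) {
  m <- miss.case(df)
  # One factor level per possible NA count from 0 to the observed maximum,
  # so counts that never occur still have a slot on the axis
  d <- data.frame(number.of.NA = factor(m, levels = seq(0, max(m))))

  g <- ggplot(d, aes(x = number.of.NA)) +
    xlab("Number of NAs") +
    scale_x_discrete(drop = FALSE)  # keep unused factor levels on the axis

  if (percent) {
    # Bar height as a percentage of all cases
    g + geom_bar(aes(y = after_stat(count / sum(count) * 100))) +
      ylab("percent")
  } else {
    g + geom_bar() + ylab("count")
  }
}
```

Calling miss.hist(test.data) with the test data from the question should then draw one tick per level without any manual breaks or limits.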

Upvotes: 1
