akamini

Reputation: 13

R geom_bar not aligning with X axis

I have 2 tables of data to plot which are structurally identical, but the resulting plots are quite different using geom_bar() in ggplot (R 4.1.0, tidyverse). Below, in Plot 1 (generated from t1) the bar arrangement on the X axis overlaps adjacent values on the scale, whereas in Plot 2 (generated from t2) they align as I would expect. I cannot figure out why this would be - can anyone spot the problem?

Thanks

Table 1:

t1 <- structure(list(Year = c(2018, 2018, 2018, 2021, 2021, 2021), 
        Area = c("1a", "1b", "2", "1a", "1b", "2"), Value = c(131.333333333333, 
        75, 90, 178.2, 247.444444444444, 152.2)), row.names = c(NA, 
    -6L), groups = structure(list(Year = c(2018, 2018, 2018, 2021, 
    2021, 2021), Area = c("1a", "1b", "2", "1a", "1b", "2"), .rows = structure(list(
        1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -6L), class = c("tbl_df", 
    "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
    "tbl_df", "tbl", "data.frame"))

Table 2:

t2 <- structure(list(Year = c(2018, 2018, 2018, 2019, 2019, 2019, 2021, 
    2021, 2021), Area = c("1a", "1b", "2", "1a", "1b", "2", "1a", 
    "1b", "2"), Value = c(331.764705882353, 306.666666666667, 229.538461538462, 
    274.235294117647, 268.846153846154, 209.642857142857, 374.058823529412, 
    333.833333333333, 189.545454545455)), row.names = c(NA, -9L), groups = structure(list(
        Year = c(2018, 2018, 2018, 2019, 2019, 2019, 2021, 2021, 
        2021), Area = c("1a", "1b", "2", "1a", "1b", "2", "1a", "1b", 
        "2"), .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L, 
            8L, 9L), ptype = integer(0), class = c("vctrs_list_of", 
        "vctrs_vctr", "list"))), row.names = c(NA, -9L), class = c("tbl_df", 
    "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
    "tbl_df", "tbl", "data.frame"))

Plot code (swap t1/t2):

library(ggplot2)
p <- ggplot(t1, aes(x = Year, y = Value, fill = Area)) +
    geom_bar(position = position_dodge(.9), stat = "identity")
p  # print the plot

[Image: Plot 1, generated from t1]

[Image: Plot 2, generated from t2]

Upvotes: 1

Views: 1355

Answers (1)

Limey

Reputation: 12461

From the online documentation for position_dodge():

Dodging preserves the vertical position of a geom while adjusting the horizontal position. position_dodge() requires the grouping variable to be specified in the global or geom_* layer. Unlike position_dodge(), position_dodge2() works without a grouping variable in a layer. position_dodge2() works with bars and rectangles, but is particularly useful for arranging box plots, which can have variable widths.

So,

ggplot(t1,aes(x=Year,y=Value,fill=Area)) +
  geom_bar(position=position_dodge2(.9),stat="identity")

Edit in response to OP's comment below

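ggplot() can only plot data that it knows about. You only have data for 2018 and 2021 in t1, so that's all it can plot. The overlap comes from the interaction of this "missing" data with the fact that Year is a continuous variable: by default, geom_bar() makes bars 90% as wide as the smallest gap between x values. In t1 that gap is 3 years, so each group is about 2.7 units wide, far more than the 0.9 dodge width, and the dodged bars overlap. In t2 the smallest gap is 1 year, so the bar width and the dodge width agree.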
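
The arithmetic behind this can be sketched in base R. This is only an approximation of ggplot2's default behaviour, not its actual code; `smallest_gap()` and `dodge_geometry()` are hypothetical helper names made up for the illustration, and the 90% default and the three Area groups are taken from the question's plot.

```r
# Smallest gap between distinct x values; this mirrors what ggplot2 calls
# the "resolution" of a continuous variable (assumption: Year is not integer).
smallest_gap <- function(x) min(diff(sort(unique(x))))

# Geometry of one dodged bar: the default total width is 90% of the gap and
# is split across n_groups bars, while bar centres are spaced by
# dodge_width / n_groups.
dodge_geometry <- function(years, n_groups = 3, dodge_width = 0.9) {
  total_width <- 0.9 * smallest_gap(years)
  c(bar_width = total_width / n_groups,
    centre_spacing = dodge_width / n_groups)
}

dodge_geometry(c(2018, 2021))        # t1: bars wider than their spacing, so they overlap
dodge_geometry(c(2018, 2019, 2021))  # t2: bar width equals spacing, so bars sit side by side
```

For t1 each bar comes out about three times as wide as the distance between bar centres, which matches the overlap seen in Plot 1.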

A simple solution is to put the missing data back. To do this, I need to ungroup the sample data because you can't add rows to a grouped data frame. But that's not a problem. The grouping is irrelevant here.

library(tidyverse)  # provides %>%, ungroup(), add_row() and ggplot()
t1 %>% 
  ungroup() %>%  # add_row() refuses grouped data frames
  add_row(Year=2019, Area="1a", Value=0) %>% 
  add_row(Year=2019, Area="1b", Value=0) %>% 
  add_row(Year=2019, Area="2", Value=0) %>% 
  ggplot(aes(x=Year,y=Value,fill=Area)) +
  geom_bar(position=position_dodge(.9),stat="identity")

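Alternatively, convert Year to a factor (ungrouping first, because mutate() will not modify a grouping variable):

t1 %>% 
  ungroup() %>% 
  mutate(Year = factor(Year, levels = 2018:2021)) %>% 
  ggplot(aes(x = Year, y = Value, fill = Area)) +
  geom_bar(position = position_dodge(.9), stat = "identity")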
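
A third option, not part of the answer above, is to keep Year numeric and set the bar width explicitly so it matches the dodge width. The sketch below assumes ggplot2 is installed; `t1_plain` is a hypothetical ungrouped copy of t1 from the question, rebuilt here only so the example runs on its own.

```r
library(ggplot2)

# Ungrouped copy of t1 from the question (values copied as posted).
t1_plain <- data.frame(
  Year  = c(2018, 2018, 2018, 2021, 2021, 2021),
  Area  = c("1a", "1b", "2", "1a", "1b", "2"),
  Value = c(131.333333333333, 75, 90, 178.2, 247.444444444444, 152.2)
)

# width = 0.9 overrides the default (90% of the 3-year gap), so each group
# of three bars spans 0.9 units in total and matches the dodge width.
p <- ggplot(t1_plain, aes(x = Year, y = Value, fill = Area)) +
  geom_bar(width = 0.9, position = position_dodge(0.9), stat = "identity")
p
```

Unlike the factor approach, this keeps the axis continuous, so the empty gap for 2019 and 2020 remains visible between the two groups of bars.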

It never hurts to look at your data from time to time. Thinking about your data is even better.

Upvotes: 1
