Reputation: 13
I have 2 tables of data to plot which are structurally identical, but the resulting plots are quite different using geom_bar() in ggplot (R 4.1.0, tidyverse). Below, in Plot 1 (generated from t1) the bar arrangement on the X axis overlaps adjacent values on the scale, whereas in Plot 2 (generated from t2) they align as I would expect. I cannot figure out why this would be - can anyone spot the problem?
Thanks
Table 1:
t1 <- structure(list(Year = c(2018, 2018, 2018, 2021, 2021, 2021),
Area = c("1a", "1b", "2", "1a", "1b", "2"), Value = c(131.333333333333,
75, 90, 178.2, 247.444444444444, 152.2)), row.names = c(NA,
-6L), groups = structure(list(Year = c(2018, 2018, 2018, 2021,
2021, 2021), Area = c("1a", "1b", "2", "1a", "1b", "2"), .rows = structure(list(
1L, 2L, 3L, 4L, 5L, 6L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -6L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
Table 2:
t2 <- structure(list(Year = c(2018, 2018, 2018, 2019, 2019, 2019, 2021,
2021, 2021), Area = c("1a", "1b", "2", "1a", "1b", "2", "1a",
"1b", "2"), Value = c(331.764705882353, 306.666666666667, 229.538461538462,
274.235294117647, 268.846153846154, 209.642857142857, 374.058823529412,
333.833333333333, 189.545454545455)), row.names = c(NA, -9L), groups = structure(list(
Year = c(2018, 2018, 2018, 2019, 2019, 2019, 2021, 2021,
2021), Area = c("1a", "1b", "2", "1a", "1b", "2", "1a", "1b",
"2"), .rows = structure(list(1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), row.names = c(NA, -9L), class = c("tbl_df",
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
Plot code (swap t1/t2):
library(ggplot2)
p <- ggplot(t1, aes(x = Year, y = Value, fill = Area)) +
  geom_bar(position = position_dodge(.9), stat = "identity")
p
Upvotes: 1
Views: 1355
Reputation: 12461
From the online doc:
Dodging preserves the vertical position of a geom while adjusting the horizontal position. position_dodge() requires the grouping variable to be specified in the global or geom_* layer. Unlike position_dodge(), position_dodge2() works without a grouping variable in a layer. position_dodge2() works with bars and rectangles, but is particularly useful for arranging box plots, which can have variable widths.
So,
ggplot(t1,aes(x=Year,y=Value,fill=Area)) +
geom_bar(position=position_dodge2(.9),stat="identity")
Edit in response to OP's comment below
ggplot() can only plot data that it knows about. You only have data for 2018 and 2021 in t1, so that's all it can plot. The overlap is caused by the interaction of this "missing" data, the fact that Year is a continuous variable, and the other aesthetics in the plot.
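To make that interaction concrete, here is a rough base-R sketch of the arithmetic. It relies on the documented geom_bar() default that bar width is 90% of the resolution of the data, where the resolution is the smallest gap between distinct x values. The helper smallest_gap() is a hypothetical stand-in I wrote for illustration, not a ggplot2 function, and the year vectors are copied from the question.

```r
# Year columns copied from t1 and t2 in the question
years_t1 <- c(2018, 2018, 2018, 2021, 2021, 2021)
years_t2 <- c(2018, 2018, 2018, 2019, 2019, 2019, 2021, 2021, 2021)

# Hypothetical helper (not part of ggplot2): smallest gap between
# distinct values, which is what "resolution" means for numeric x
smallest_gap <- function(x) min(diff(sort(unique(x))))

gap_t1 <- smallest_gap(years_t1)  # only 2018 and 2021 are present
gap_t2 <- smallest_gap(years_t2)  # 2018 and 2019 are adjacent

# Default bar width is 90% of that gap
default_width_t1 <- 0.9 * gap_t1
default_width_t2 <- 0.9 * gap_t2

print(c(t1 = default_width_t1, t2 = default_width_t2))
```

Under these assumptions the default width for t1 comes out three times larger than for t2, while the dodge width stays at .9 in both plots, which would explain why only the t1 bars spill over their neighbours.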
A simple solution is to put the missing data back. To do this, I need to ungroup the sample data because you can't add rows to a grouped data frame. But that's not a problem. The grouping is irrelevant here.
t1 %>%
ungroup() %>%
add_row(Year=2019, Area="1a", Value=0) %>%
add_row(Year=2019, Area="1b", Value=0) %>%
add_row(Year=2019, Area="2", Value=0) %>%
ggplot(aes(x=Year,y=Value,fill=Area)) +
geom_bar(position=position_dodge(.9),stat="identity")
Alternatively, convert Year to a factor:
t1 %>% mutate(Year=factor(Year, levels=2018:2021))%>%
ggplot(aes(x=Year,y=Value,fill=Area)) +
geom_bar(position=position_dodge(.9),stat="identity")
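A further option, which is my own suggestion rather than part of the answer above, is to keep Year continuous and set the bar width explicitly so it no longer depends on the gaps in the data. The sketch below rebuilds t1 as a plain data frame (t1_plain and p_fixed are names I made up) and uses geom_col(), which is the stat = "identity" form of geom_bar(). The width of 0.9 is an assumption chosen to match the dodge width.

```r
library(ggplot2)

# t1 from the question, rebuilt as an ungrouped data frame
t1_plain <- data.frame(
  Year  = c(2018, 2018, 2018, 2021, 2021, 2021),
  Area  = c("1a", "1b", "2", "1a", "1b", "2"),
  Value = c(131.333333333333, 75, 90, 178.2, 247.444444444444, 152.2)
)

# Fix the bar width at 0.9 instead of letting it default to
# 90% of the smallest gap between years
p_fixed <- ggplot(t1_plain, aes(x = Year, y = Value, fill = Area)) +
  geom_col(width = 0.9, position = position_dodge(0.9))

p_fixed
```

This leaves the empty years visible on the axis, which may or may not be what you want. The factor approach hides the gap instead.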
It never hurts to look at your data from time to time. Thinking about your data is even better.
Upvotes: 1