Reputation: 33
I have a corpus of linguistic data in which I'm analyzing the frequency at which individuals of different linguistic proficiencies confuse two words with similar meanings (which I'll be calling "foo" and "bar"). Below is a long table of the exact same format as my real data. It contains a summary of the frequency of use and misuse of both words (freq) for each level of proficiency represented in the corpus (rating). The variable type encodes the word produced and the word intended, separated by a period. type1 is extracted from type to give only the word produced.
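For readers who want to reproduce the type1 column, here is a minimal sketch of one way it could be derived from type. The question does not say how the extraction was actually done, so the sub() call and the intended vector are assumptions for illustration only.

```r
# Labels in the "produced.intended" format described above.
type <- c("foo.bar", "bar.foo", "foo.foo", "bar.bar")

# Drop the first period and everything after it, keeping the word produced.
type1 <- sub("\\..*$", "", type)

# Hypothetical companion vector: drop everything up to and including the
# first period, keeping the word intended.
intended <- sub("^[^.]*\\.", "", type)

print(type1)
print(intended)
```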
## sample data "my.data":
my.data <- structure(list(rating = c(4L, 6L, 3L, 1L, 2L, 4L, 6L, 3L, 1L,
2L, 4L, 6L, 3L, 1L, 2L, 4L, 6L, 3L, 1L, 2L), type = c("foo.bar",
"foo.bar", "foo.bar", "foo.bar", "foo.bar", "bar.foo", "bar.foo",
"bar.foo", "bar.foo", "bar.foo", "foo.foo", "foo.foo", "foo.foo",
"foo.foo", "foo.foo", "bar.bar", "bar.bar", "bar.bar", "bar.bar",
"bar.bar"), type1 = c("foo", "foo", "foo", "foo", "foo", "bar",
"bar", "bar", "bar", "bar", "foo", "foo", "foo", "foo", "foo",
"bar", "bar", "bar", "bar", "bar"), freq = c(4e-04, 0, 0.002,
0, 0, 3e-04, 0, 0.001, 0, 0, 0.002, 6e-04, 0.004, 0.003, 0.002,
0.001, 0.001, 0.002, 0.01, 0.008)), class = "data.frame", row.names = c(NA,
20L))
> head(my.data)
rating type type1 freq
1 4 foo.bar foo 4e-04
2 6 foo.bar foo 0e+00
3 3 foo.bar foo 2e-03
4 1 foo.bar foo 0e+00
5 2 foo.bar foo 0e+00
6 4 bar.foo bar 3e-04
This is my current attempt at visualizing the above data:
The combination of side-by-side columns and stacking is in order to delineate the words produced so that the frequencies at which "bar" and "foo" are used correctly and incorrectly are easily comparable (the trend is much harder to see if each rating has only one column stacking all 4 possibilities). In order to accomplish this, I had to use type1 as x in the aesthetic and make the ratings (the actual independent variable) facets:
library(ggplot2)

my.plot <- ggplot(my.data) +
  geom_col(aes(x=type1, y=freq, fill=type)) +
  facet_grid(~rating, switch='both') +
  scale_fill_discrete(name='Actual form / target form', labels=c(foo.bar='*foo / bar', bar.foo='*bar / foo', foo.foo='foo / foo', bar.bar='bar / bar')) +
  scale_y_continuous(expand=c(0, 0)) + # to accommodate the hline
  geom_hline(yintercept=0, linewidth=1) + # attempt to reintroduce the x-axis line
  labs(x='Rating', y='Frequency') +
  my.usual.ggtheme + # does not contain anything related to facet -- I can share if necessary, but it's quite long and I don't want to waste space here
  theme(
    strip.background=element_blank(),
    strip.text.x=element_text(family='DejaVu Serif', colour='#000000', size=16),
    axis.text.x=element_blank()
  )
The issue is that the facet labels seem to cover or erase the x-axis, which I nearly fixed with the geom_hline(), but this line is still interrupted in an undesirable way. The horizontal lines under the facet labels are also problematic, as they break the consistent style of the graphs in this project (none of the others use facets like this and thus don't have the same problem).
It should look basically like this (edited with Inkscape):
I've seen non-facet solutions to produce similar graphs, like this, but they're sort of clunky, so I'd like to avoid redoing the graph entirely with a solution like that. Is there something that I can simply do to the theme or the facet_grid() call to get rid of the lines beneath the facet labels and, ideally, make the x-axis line unbroken?
Upvotes: 1
Views: 37
Reputation: 33
In addition to I_O's solution to the spacing problem, I realized that the horizontal lines seen below the facet labels do not come from the facet itself. They are in fact the original x axis: the strips were being drawn inside the axis instead of outside it. The extraneous lines are fixed by adding strip.placement='outside' to the theme; the x axis then looks correct once combined with I_O's answer. So, to generate the correct plot, the code changes to:
my.plot <- ggplot(my.data) +
  geom_col(aes(x=type1, y=freq, fill=type)) +
  facet_grid(~rating, switch='both') +
  scale_fill_discrete(name='Actual form / target form', labels=c(foo.bar='*foo / bar', bar.foo='*bar / foo', foo.foo='foo / foo', bar.bar='bar / bar')) +
  scale_y_continuous(expand=c(0, 0)) +
  labs(x='Rating', y='Frequency') +
  my.usual.ggtheme +
  theme(
    strip.background=element_blank(),
    strip.text.x=element_text(family='DejaVu Serif', colour='#000000', size=16),
    axis.text.x=element_blank(),
    panel.spacing=unit(0, 'lines'),
    axis.ticks.x=element_blank(),
    strip.placement='outside'
  )
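Since my.usual.ggtheme is not shared here, the following is a rough self-contained sketch of the same idea that can be run as-is. It assumes theme_classic() as a stand-in for my.usual.ggtheme (it draws axis lines, which is what makes the strip placement visible), and demo.data is just the rating 3 and rating 4 rows of the sample data from the question. The font and legend settings are left out to keep it short.

```r
library(ggplot2)

# Subset of the question's sample data (ratings 3 and 4 only).
demo.data <- data.frame(
  rating = rep(c(3L, 4L), each = 4),
  type   = rep(c("foo.bar", "bar.foo", "foo.foo", "bar.bar"), times = 2),
  type1  = rep(c("foo", "bar", "foo", "bar"), times = 2),
  freq   = c(0.002, 0.001, 0.004, 0.002, 4e-04, 3e-04, 0.002, 0.001)
)

demo.plot <- ggplot(demo.data) +
  geom_col(aes(x = type1, y = freq, fill = type)) +
  facet_grid(~rating, switch = "both") +   # strips at the bottom
  scale_y_continuous(expand = c(0, 0)) +   # bars sit directly on the axis line
  labs(x = "Rating", y = "Frequency") +
  theme_classic() +                        # assumption: stands in for my.usual.ggtheme
  theme(
    strip.background = element_blank(),
    axis.text.x      = element_blank(),
    axis.ticks.x     = element_blank(),
    panel.spacing    = unit(0, "lines"),   # no gaps between facets
    strip.placement  = "outside"           # strips below the axis line, not above it
  )

print(demo.plot)
```

Removing the strip.placement line from this sketch should bring back the extra lines under the facet labels, which is a quick way to confirm where they come from.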
Upvotes: 1
Reputation: 6911
You could remove the facet gaps and x-axis ticks by adding this to your theme:
## ggplot components +
theme(
panel.spacing = unit(0, "lines"),
axis.ticks.x = element_blank(),
## existing specifications
)
Upvotes: 2