nklsstll

Reputation: 25

ggplot2: grouping a barplot by several instead of a single categorical variable

I have a data.frame that looks like this:

df <- data.frame(mean_swd = c(4.0000, 5.3333, 6.3333, 5.6666, 3.6666),
             afd_pot = c(0, 1, 0, 0, 1),
             union_pot = c(0, 1, 1, 1, 1),
             spd_pot = c(0, 1, 0, 0, 1),
             fdp_pot = c(0, 1, 1, 0, 0),
             green_pot = c(0, 1, 0, 1, 1),
             linke_pot = c(1, 0, 1, 1, 1))

> df
  mean_swd afd_pot union_pot spd_pot fdp_pot green_pot linke_pot
1   4.0000       0         0       0       0         0         1
2   5.3333       1         1       1       1         1         0
3   6.3333       0         1       0       1         0         1
4   5.6666       0         1       0       0         1         1
5   3.6666       1         1       1       0         1         1

The pot variables represent a potential (1) or no potential (0) to vote for a party, mean_swd stands for a mean score on an attitude scale (from 1-7), the rows represent individuals.

I want to produce a grouped barplot using ggplot2 that puts several barplots into one plot. It should plot the mean of mean_swd against each of the 6 pot variables separately, so that I can compare the mean scores on mean_swd across the groups of persons for which ..._pot == 1. Additionally, but not necessarily, I would like to group by the levels of these variables (1/0), so that I can compare mean_swd between those who have a potential of voting for that party and those who don't.

As I don't have a single categorical variable by which to group, I can't figure out how to code this and haven't found any solutions to the problem. The grouping solutions I found all work with a single categorical variable. But I can't transform these six variables into one, as the potentials are not mutually exclusive: the same individual can count towards several parties. The separate barplots thus need to be calculated from different, overlapping subsets of observations. I also thought about grouping by boolean expressions but couldn't find any sources for this.

Any suggestions? Thank you in advance. Also feel free to criticize the presentation of my problem, as this is my first posting ever.

Upvotes: 0

Views: 270

Answers (2)

Anonymous coward

Reputation: 2091

Is this what you are after? Feel free to clarify. I'm not sure if you'd rather have one that counts 1s and 0s and plots that against the average though.

library(tidyr)    # gather()
library(ggplot2)

df <- data.frame(mean_swd = c(4.0000, 5.3333, 6.3333, 5.6666, 3.6666),
                 afd_pot = c(0, 1, 0, 0, 1),
                 union_pot = c(0, 1, 1, 1, 1),
                 spd_pot = c(0, 1, 0, 0, 1),
                 fdp_pot = c(0, 1, 1, 0, 0),
                 green_pot = c(0, 1, 0, 1, 1),
                 linke_pot = c(1, 0, 1, 1, 1),
                 Group = c(1,2,3,4,5))
# Reshape to long format: one row per individual (Group) and variable
df1 <- gather(df, key = variables, value = value, mean_swd:linke_pot)
# One panel per individual, one bar per variable
ggplot(df1, aes(x = variables, y = value, fill = factor(Group))) +
  facet_wrap(~Group) +
  geom_bar(stat = "identity", color = "black", position = position_dodge()) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) +
  labs(fill = "Groups")
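
If the goal is instead the mean of mean_swd among those with ..._pot == 1, one possible approach is to compute each party's mean from a boolean subset and plot the six results. This is only a minimal sketch, assuming the df from the question (without the Group column); the names pot_cols and plot_df are made up for illustration.

```r
library(ggplot2)

df <- data.frame(mean_swd = c(4.0000, 5.3333, 6.3333, 5.6666, 3.6666),
                 afd_pot = c(0, 1, 0, 0, 1),
                 union_pot = c(0, 1, 1, 1, 1),
                 spd_pot = c(0, 1, 0, 0, 1),
                 fdp_pot = c(0, 1, 1, 0, 0),
                 green_pot = c(0, 1, 0, 1, 1),
                 linke_pot = c(1, 0, 1, 1, 1))

# All indicator columns (hypothetical helper name)
pot_cols <- grep("_pot$", names(df), value = TRUE)

# For each party, average mean_swd over the rows where that indicator is 1.
# The subsets overlap, because one person can have a potential for several parties.
plot_df <- data.frame(
  party    = sub("_pot$", "", pot_cols),
  mean_swd = unname(sapply(df[pot_cols], function(p) mean(df$mean_swd[p == 1])))
)

# One bar per party
ggplot(plot_df, aes(x = party, y = mean_swd)) +
  geom_col()
```

Because each mean is computed on its own subset, the six indicators never have to be collapsed into a single categorical variable.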


Upvotes: 0

remek

Reputation: 943

Welcome to Stack Overflow!

Are you looking for something like this? Is this going in the right direction?

library(magrittr)
library(dplyr)
library(reshape2)
library(ggplot2)

df <- data.frame(mean_swd = c(4.0000, 5.3333, 6.3333, 5.6666, 3.6666),
                 afd_pot = c(0, 1, 0, 0, 1),
                 union_pot = c(0, 1, 1, 1, 1),
                 spd_pot = c(0, 1, 0, 0, 1),
                 fdp_pot = c(0, 1, 1, 0, 0),
                 green_pot = c(0, 1, 0, 1, 1),
                 linke_pot = c(1, 0, 1, 1, 1))

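# Long format: one row per individual and pot variable,
# then the mean of mean_swd for each variable and level (0/1)
dat <- df %>%
  melt(id.vars = "mean_swd") %>% 
  group_by(variable, value) %>%
  summarise(mean = mean(mean_swd))

# Treat the 0/1 indicator as a factor so it maps to discrete fill colours
dat$value %<>% as.factor()

# Note: geom_col() stacks the two bars of each variable by default
ggplot(dat, aes(variable, mean, fill = value)) + geom_col()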
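
Since stacked means are hard to compare, a variation that places the 0 and 1 bars side by side may be easier to read. The following is a sketch of that idea, not a definitive implementation: it swaps in tidyr::pivot_longer() for melt() and uses position = "dodge"; the column names party, potential and mean_score are chosen here for illustration.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- data.frame(mean_swd = c(4.0000, 5.3333, 6.3333, 5.6666, 3.6666),
                 afd_pot = c(0, 1, 0, 0, 1),
                 union_pot = c(0, 1, 1, 1, 1),
                 spd_pot = c(0, 1, 0, 0, 1),
                 fdp_pot = c(0, 1, 1, 0, 0),
                 green_pot = c(0, 1, 0, 1, 1),
                 linke_pot = c(1, 0, 1, 1, 1))

dat <- df %>%
  # One row per individual and party indicator
  pivot_longer(cols = ends_with("_pot"),
               names_to = "party", values_to = "potential") %>%
  # Mean attitude score per party and indicator level (0/1)
  group_by(party, potential) %>%
  summarise(mean_score = mean(mean_swd), .groups = "drop") %>%
  mutate(potential = factor(potential, levels = c(0, 1),
                            labels = c("no potential", "potential")))

# Side-by-side bars: for each party, the 0 group next to the 1 group
ggplot(dat, aes(x = party, y = mean_score, fill = potential)) +
  geom_col(position = "dodge")
```

Each individual contributes to all six parties after reshaping, which is why the non-exclusive potentials are not a problem here.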


Upvotes: 1
