alexvc

Reputation: 67

Sum multiple distinct columns and plot on the same ggplot (without grouping variables)

I have a dataset that looks like this

library(dplyr)
df=data.frame(
  x1=c(0,1,1,0,0,0,0,1,0),
  x2=c(0,0,0,0,1,0,0,1,0),
  x3=c(1,1,1,0,0,0,1,0,0),
  x4=c(0,0,0,0,0,1,1,0,1)
)

Each variable (x1-x4) corresponds to a different survey question with True/False values (1/0). I am trying to plot the frequency of 1s for each variable on a single simple bar plot. All the solutions I've seen so far rely on a grouping variable of some sort, but there isn't one here.

Ideally I want a plot that has 4 simple values - the frequencies of 1s in each column.

Bonus points - I'm trying to parameterize this process by creating a vector with the column names.

So I have a variable that is
cols <- c("x1", "x2", "x3", "x4")

Upvotes: 0

Views: 240

Answers (1)

stefan

Reputation: 123783

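You can achieve this by converting your data frame to long format, e.g. with tidyr::pivot_longer. Using dplyr::select together with all_of(cols), you can first pick the desired columns from your vector of names. Because geom_col stacks the values within each name, each bar's height equals the sum of that column, i.e. the number of 1s: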

library(tidyr)
library(ggplot2)
library(dplyr)

df=data.frame(
  x1=c(0,1,1,0,0,0,0,1,0),
  x2=c(0,0,0,0,1,0,0,1,0),
  x3=c(1,1,1,0,0,0,1,0,0),
  x4=c(0,0,0,0,0,1,1,0,1))

cols <- c("x1", "x2", "x4")

df_long <- df %>% 
  select(all_of(cols)) %>% 
  pivot_longer(cols = everything(), names_to = "name", values_to = "value") 
df_long
#> # A tibble: 27 x 2
#>    name  value
#>    <chr> <dbl>
#>  1 x1        0
#>  2 x2        0
#>  3 x4        0
#>  4 x1        1
#>  5 x2        0
#>  6 x4        0
#>  7 x1        1
#>  8 x2        0
#>  9 x4        0
#> 10 x1        0
#> # … with 17 more rows
ggplot(df_long, aes(name, value)) +
  geom_col()
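
If you would rather compute the frequencies explicitly before plotting, here is a minimal sketch of one alternative. It assumes the same df as in the question and uses all four columns; the names counts, freq and p are only illustrative. colSums counts the 1s per column, and the result is reshaped into a two-column data frame that geom_col can plot directly:

```r
library(ggplot2)

df <- data.frame(
  x1 = c(0, 1, 1, 0, 0, 0, 0, 1, 0),
  x2 = c(0, 0, 0, 0, 1, 0, 0, 1, 0),
  x3 = c(1, 1, 1, 0, 0, 0, 1, 0, 0),
  x4 = c(0, 0, 0, 0, 0, 1, 1, 0, 1)
)

# Vector of column names to include (taken from the question)
cols <- c("x1", "x2", "x3", "x4")

# Count the 1s in each selected column; the result is a named numeric vector
counts <- colSums(df[cols] == 1)

# One row per column: its name and its count (x1 = 3, x2 = 2, x3 = 4, x4 = 3)
freq <- data.frame(name = names(counts), n = as.vector(counts))

# One bar per column, bar height = number of 1s
p <- ggplot(freq, aes(name, n)) +
  geom_col() +
  labs(x = NULL, y = "Number of 1s")
p
```

This should give the same bar heights as the pivot_longer approach, but the counts are available in freq if you also want to print or label them.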

Upvotes: 1
