Canna

Reputation: 113

Sort dataset for grouped boxplot

I have a rather untidy dataset and can't wrap my head around how to reshape it in R. The alternative would be to do it by hand in Excel, but since I have several of these datasets, that would take forever.

What I need is a grouped boxplot. For this I think I need a dataset with 4 columns:
species, group (A or B), variable, value.
But what I have at the moment is only:
variable, plus one column per species/group combination (species and group are stuck together in the column name, e.g. species1_A).

Here is a reproducible example:

variable <- c('precipitation','soil','land use')
species1_A <- c(10000, 500, 1322)
species1_B <- c(11500, 200, 600)
species2_A <- c(10000, 500, 1489)
species2_B <- c(15687, 800, 587)
df <- data.frame(variable, species1_A, species1_B, species2_A, species2_B)

So I guess I have to create a whole new column "group" with A or B and somehow tell R to take that information from the "species1_A" name.

Can anyone help me please? Thank you!

Upvotes: 0

Views: 71

Answers (1)

Will Oldham

Reputation: 1054

I'd suggest the following:

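library(tidyverse)

df %>%
  # stack the four species*_* columns into name/value pairs
  pivot_longer(contains("species"), names_to = "name", values_to = "value") %>%
  # split e.g. "species1_A" into species = "species1" and group = "A"
  separate(name, into = c("species", "group"), sep = "_") %>%
  ggplot(aes(x = species, y = value, color = group)) +
  facet_wrap(~variable) +
  geom_point()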
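
If you only want the reshaped table with the four columns from the question (without plotting), here is a minimal sketch. It assumes every column other than `variable` is named `<species>_<group>` with exactly one underscore, and it uses the `names_sep` argument of `pivot_longer()` to split the names in one step instead of a separate `separate()` call. The name `long_df` is just a placeholder.

```r
library(tidyr)

# Example data from the question
df <- data.frame(
  variable   = c("precipitation", "soil", "land use"),
  species1_A = c(10000, 500, 1322),
  species1_B = c(11500, 200, 600),
  species2_A = c(10000, 500, 1489),
  species2_B = c(15687, 800, 587)
)

# Reshape to long format and split the old column names at "_" in one step.
# Assumes each non-`variable` column name contains exactly one underscore.
long_df <- pivot_longer(
  df,
  cols      = -variable,
  names_to  = c("species", "group"),
  names_sep = "_",
  values_to = "value"
)

print(long_df)
```

This should give one row per variable/species/group combination, with the columns variable, species, group and value.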

Sorry, I'm not sure how you want things laid out, and your example dataset has only one value per species/group/variable combination, so a boxplot would have nothing to summarise. You can change geom_point to geom_boxplot once you have several values per group. Spacing between the boxes can be adjusted with position_dodge. HTH.
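
To give an idea of what the grouped boxplot could look like once there are several values per group, here is a rough sketch. The data in `sim` is simulated purely for illustration (10 made-up replicates per combination, random values), it is not your data; the dodge width of 0.9 and `scales = "free_y"` are just starting points to adjust.

```r
library(ggplot2)

# Simulated long-format data, for illustration only: 10 made-up replicates
# for each variable/species/group combination.
set.seed(1)
sim <- expand.grid(
  variable  = c("precipitation", "soil", "land use"),
  species   = c("species1", "species2"),
  group     = c("A", "B"),
  replicate = 1:10,
  stringsAsFactors = FALSE
)
sim$value <- rnorm(nrow(sim), mean = 1000, sd = 200)

# One panel per variable, species on the x axis, groups A and B side by side.
# position_dodge(width = ...) controls the spacing between the paired boxes.
p <- ggplot(sim, aes(x = species, y = value, colour = group)) +
  geom_boxplot(position = position_dodge(width = 0.9)) +
  facet_wrap(~variable, scales = "free_y")

# Run print(p) (or just p at the console) to draw the plot.
```

With real data you would replace `sim` with the reshaped long table.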

Upvotes: 1
