user7264

Reputation: 123

ggplot a dataframe with a matrix variable?

I am trying to replicate this plot using ggplot:

pacman::p_load(tidyverse, pls, remotes)
install_github("rwehrens/ChemometricsWithR")

data(gasoline)
wavelengths <- seq(900,1700, 2)
matplot(wavelengths, t(gasoline$NIR), type = "l", lty = 1, xlab = "Wavelength (nm)", ylab = "1/R")

but I cannot seem to make it work. The gasoline dataset is a tricky one: one of its two variables is a matrix, which I have never encountered before. How can I clean up this dataset to make it tidy? I tried the following:

gasoline2 <- as.data.frame(as.matrix(gasoline)) %>% 
  pivot_longer(cols = -c(octane),
              names_to = "wavelength",
              values_to = "1/R") 

but cannot seem to make it work. This code:

ggplot(gasoline, mapping = aes(x = wavelengths, y = t(gasoline$NIR)))+
  geom_line(mapping = aes(color = octane))

is returning this error:

Error in `geom_line()`:
! Problem while computing aesthetics.
ℹ Error occurred in the 1st layer.
Caused by error in `check_aesthetics()`:
! Aesthetics must be either length 1 or the same as the data (60)
✖ Fix the following mappings: `x` and `y`
Backtrace:
  1. base (local) `<fn>`(x)
  2. ggplot2:::print.ggplot(x)
  4. ggplot2:::ggplot_build.ggplot(x)
  5. ggplot2:::by_layer(...)
 12. ggplot2 (local) f(l = layers[[i]], d = data[[i]])
 13. l$compute_aesthetics(d, plot)
 14. ggplot2 (local) compute_aesthetics(..., self = self)
 15. ggplot2:::check_aesthetics(evaled, n)

Upvotes: 0

Views: 94

Answers (3)

user7264

Reputation: 123

pacman::p_load(tidyverse, pls, remotes)
install_github("rwehrens/ChemometricsWithR")

gasoline2 <- as.data.frame(as.matrix(gasoline)) %>% 
  pivot_longer(cols = -c(octane),
              names_to = "wavelength",
              values_to = "1/R") %>% 
  mutate(wavelength = str_remove_all(wavelength, "[^[:digit:]]"))

ggplot(gasoline2, mapping = aes(x = wavelength, y = `1/R`))+
  geom_line(mapping = aes(color = octane))

ggplot plot

However, I cannot figure out the scale_color_gradient() syntax to match the colors.

Edit: Thanks to the other two posters, here is my final replica!

gasoline2 <- as.data.frame(as.matrix(gasoline)) %>% 
  mutate(group = row_number()) %>%
  relocate(group, .after = octane) %>% 
  pivot_longer(cols = -c(octane, group),
              names_to = "wavelength",
              values_to = "spectra") %>% 
  mutate(wavelength = as.numeric(str_remove_all(wavelength, "[^[:digit:]]"))) %>% 
  mutate(octane = as.factor(octane))

ggplot(gasoline2, mapping = aes(x = wavelength, y = `spectra`, color = octane, group = group))+
  geom_line(linewidth = 0.5)+
  scale_color_discrete(guide = "none")+
  xlab(label = "\nWavelength (nm)")+
  ylab(label = "1/R\n")+
  theme_classic()+
  scale_y_continuous(n.breaks = 7)+
  scale_x_continuous(breaks = seq(1000,1600,200))

final pic

Upvotes: 1

Esther

Reputation: 436

The answer from @user7264 is correct, apart from one small mistake when creating the new wavelength variable: it should be numeric rather than character. Taking that and the colour question into account, I propose the following:

pacman::p_load(tidyverse, pls, remotes)
install_github("rwehrens/ChemometricsWithR")

gasoline2 <- as.data.frame(as.matrix(gasoline)) %>% 
  pivot_longer(cols = -c(octane),
               names_to = "wavelength",
               values_to = "1/R") %>% 
  # From the above code of @user7264, add as.numeric()
  mutate(wavelength = as.numeric(str_remove_all(wavelength, "[^[:digit:]]")))

ggplot(gasoline2, mapping = aes(x = wavelength, y = `1/R`)) +
  geom_line(mapping = aes(color = octane)) +
  scale_x_continuous(breaks = seq(1000, 2000, 200)) +
  scale_colour_continuous(type = "viridis") +
  theme_bw()

which results in this plot.

resulting plot
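
To illustrate why the numeric conversion matters, here is a minimal base R sketch. The three column names are typed in by hand and assume the "NIR.900 nm" format that as.matrix() produces for this dataset; gsub() stands in for str_remove_all() so the snippet needs no packages.

```r
# Hand-typed example names (assumed format, not read from the real data)
nm <- c("NIR.900 nm", "NIR.902 nm", "NIR.1000 nm")

# Strip everything that is not a digit (base R stand-in for str_remove_all())
chr <- gsub("[^[:digit:]]", "", nm)
num <- as.numeric(chr)

# Character values sort alphabetically, so "1000" lands before "900"
sort(chr)
# Numeric values sort by magnitude
sort(num)
```

A character wavelength is treated as a discrete variable, so ggplot orders the axis alphabetically instead of by wavelength; the numeric version gives a continuous axis.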

I hope this is useful! 😃

Upvotes: 2

anon

Reputation:

This is pretty close. If you care about the specific colors, you could use scale_color_manual() or some other scale_color_*() function.
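
For instance, if you keep octane numeric, a two-colour gradient is one option. This is only a sketch on made-up data: toy_long and its values are hypothetical stand-ins for the reshaped gasoline data, and the blue/red endpoints are arbitrary, not the colours of the original plot.

```r
library(ggplot2)

# Made-up long-format data: 3 samples x 5 wavelengths (not the real spectra)
toy_long <- expand.grid(wavelength = seq(900, 908, 2), group = 1:3)
toy_long$octane  <- c(85.3, 88.1, 87.6)[toy_long$group]
toy_long$spectra <- sin(toy_long$wavelength / 50) + toy_long$group / 10

# octane stays numeric, so a continuous gradient scale applies;
# group = group keeps one line per sample
# (linewidth needs ggplot2 >= 3.4.0)
p <- ggplot(toy_long,
            aes(x = wavelength, y = spectra, color = octane, group = group)) +
  geom_line(linewidth = 0.5) +
  scale_color_gradient(low = "blue", high = "red") +
  theme_classic()

p
```

The group aesthetic is still needed here, because a continuous colour variable does not split the data into separate lines on its own.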


# give every sample (row) a unique ID for grouping later on
gasoline <- 
  gasoline |> 
  mutate(group = 1:n())

#colbind matrix-column as a dataframe
gasoline2 <- 
  bind_cols(
    gasoline |> select(octane, group),
    gasoline |> pull(NIR)
  ) |> 
  # convert colnames to numeric wavelengths
  pivot_longer(
    cols = c(-octane, -group),
    names_to = "wavelength",
    values_to = "1/R",
    names_pattern = "(\\d+)",
    names_transform = as.numeric
  ) |> 
  # octane as factor for line colors
  mutate(octane = as.factor(octane)) 


ggplot(gasoline2,
    #group aesthetic to plot separate lines for repeat values of octane
       aes(x = wavelength, y = `1/R`, color = octane, group = group)) +
  geom_line(linewidth = .7) +  # `size` for lines is deprecated since ggplot2 3.4.0
  scale_color_discrete(guide = "none") +
  theme_classic()

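
As background on the matrix column that bind_cols() unpacks above, here is a small base R sketch. The toy data frame and its numbers are made up for illustration; only the shape (one numeric column plus one matrix column) mirrors gasoline.

```r
# Made-up stand-in for gasoline: 3 samples, 4 wavelengths
nir <- matrix((1:12) / 10, nrow = 3,
              dimnames = list(NULL, paste(seq(900, 906, 2), "nm")))
toy <- data.frame(octane = c(85.3, 88.1, 87.6))
toy$NIR <- nir            # a whole matrix stored as a single column

dim(toy)                  # the matrix counts as one column of the data frame
dim(toy$NIR)              # the matrix itself has one column per wavelength

# as.matrix() flattens it: octane plus one column per wavelength
flat <- as.matrix(toy)
ncol(flat)
colnames(flat)
```

This is also why the original aes() call fails: the data frame has one row per sample, while the wavelength vector and the transposed matrix have one entry or row per wavelength, so their lengths cannot match the data.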

Upvotes: 1
