Plotting spectroscopic data using ggplot2

Question

I am trying to plot spectroscopic data using ggplot2. I get my data in the following form:

[image: data structure]

My code so far is:

library(ggplot2)
library(reshape2)
melt_data <- melt(spectroscopic_data, id.vars = "sample_name", variable.name = "wavenumber", value.name = "intensity")
melt_data$sample_name <- factor(melt_data$sample_name)
melt_data$wavenumber <- as.numeric(levels(melt_data$wavenumber))[melt_data$wavenumber]
ggplot(melt_data, aes(x=wavenumber, y=intensity, group=sample_name, color=sample_name)) + geom_line() +
scale_x_reverse(breaks=c(10000, 9500, 9000, 8500, 8000, 7500, 7000, 6500, 6000, 5500, 5000, 4500, 4000)) +
scale_color_manual(values=c("#FF0000", "#0000CC", "#00CC00", "#FF00FF", "#FF9900", "#000000", "#999900", "#33FFFF", "#FFCCFF", "#FFFF00", "#999999", "#9933FF", "#993300", "#99FF33")) + 
theme_bw() + 
theme(legend.position = "bottom") +
labs(x=expression(wavenumbers), y="intensity", colour = "") + 
theme(legend.text=element_text(size=10), axis.text=element_text(size=12), axis.title=element_text(size=14)) + 
guides(colour = guide_legend(ncol = 2, keywidth=1.5, keyheight=1, override.aes = list(size=1.8)))

I need the same color for aaa-samples, bbb-samples and so on (multiple measurements of one sample) but the plot does not work. I get a plot that looks like this when you zoom in:

[image: zoom of current plot]

It looks like ggplot2 connects the measurements of the same sample into one line instead of plotting them separately. Does anyone have an idea? I have been trying to fix this for hours.

Thank you!

Luke C · Accepted Answer

One way is to add a sample id to your data frame before you reshape it. That lets you keep names like "aaa" and "bbb" for the color, while a unique identifier per measurement acts as the grouping variable (otherwise ggplot2 cannot tell apart two observations that share a sample name and an x value). Here is an example where I tried to mimic your input data:

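library(ggplot2)
library(reshape2)

# Two measurements each of samples "aaa" and "bbb" at three wavenumbers.
# Building the data frame directly keeps the intensities numeric
# (cbind() on mixed types would turn everything into character).
ex <- data.frame(
  sample_name = c("aaa", "aaa", "bbb", "bbb"),
  "4000" = c(0.426, 0.405, 0.409, 0.395),
  "4004" = c(0.430, 0.408, 0.411, 0.399),
  "4008" = c(0.432, 0.411, 0.413, 0.401),
  check.names = FALSE
)

# One unique id per row, i.e. per measurement
ex$sample_id <- 1:nrow(ex)

melt <- melt(ex, id.vars = c("sample_name", "sample_id"), variable.name = "wavenumber", value.name = "intensity")

# melt() returns wavenumber as a factor; convert it to numeric for a continuous x axis
melt$wavenumber <- as.numeric(as.character(melt$wavenumber))

ggplot(melt, aes(x = wavenumber, y = intensity, group = sample_id, color = sample_name)) +
  geom_line() +
  theme_classic()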

This draws a separate line for each measurement, grouped by sample id, while keeping the color tied to the sample name.

Is that sort of what you're after?
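
To see why the grouping variable matters, here is a rough check that counts how many lines each grouping would draw. It is only a sketch: the toy data are made up, and the names long, p_name and p_id are just for illustration. It uses ggplot2's layer_data() to look at the groups computed for the first layer.

```r
library(ggplot2)

# Toy long-format data (made up): two measurements each of "aaa" and "bbb"
# at three wavenumbers
long <- data.frame(
  sample_name = rep(c("aaa", "aaa", "bbb", "bbb"), times = 3),
  sample_id   = rep(1:4, times = 3),
  wavenumber  = rep(c(4000, 4004, 4008), each = 4),
  intensity   = c(0.426, 0.405, 0.409, 0.395,
                  0.430, 0.408, 0.411, 0.399,
                  0.432, 0.411, 0.413, 0.401)
)

# Grouping by name: both measurements of a sample end up in the same line
p_name <- ggplot(long, aes(x = wavenumber, y = intensity,
                           group = sample_name, color = sample_name)) +
  geom_line()

# Grouping by id: one line per measurement, color still follows the name
p_id <- ggplot(long, aes(x = wavenumber, y = intensity,
                         group = sample_id, color = sample_name)) +
  geom_line()

# Number of distinct groups (lines) in each plot
n_lines_by_name <- length(unique(layer_data(p_name, 1)$group))
n_lines_by_id   <- length(unique(layer_data(p_id, 1)$group))
n_lines_by_name  # 2
n_lines_by_id    # 4
```

With group = sample_name there are only two groups, so each line has to pass through both measurements at every wavenumber, which would explain the zigzag you see when zooming in.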

Edits below

To show the same approach with a larger dataset:

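library(ggplot2)
library(reshape2)

set.seed(1)  # make the random example reproducible

# Ten samples ("aaa" to "jjj"), two measurements each
alpha <- rep(sapply(letters[1:10], function(x) {paste(x, x, x, sep = "")}), each = 2)
adf <- data.frame(alpha)
adf$sample_id <- seq_along(alpha)
adf$t <- rnorm(length(alpha), 0.4, 0.1)

# Columns 4 to 1503: each new column is the previous one plus a small random step,
# so every row becomes a slowly rising fake spectrum
wavenum <- seq(4, 1503)
for (i in wavenum) {
  adf[, i] <- adf[, i - 1] + rnorm(nrow(adf), 0.01, 0.01)
}
adf[1:10, 1:10]

# Name the 1501 intensity columns after their wavenumbers (1400, 1404, ...)
anames <- c("sample_name", "sample_id", (1400 + 4 * seq(0, 1500)))
names(adf) <- anames

melt <- melt(adf, id.vars = c("sample_name", "sample_id"), variable.name = "wavenumber", value.name = "intensity")
melt$wavenumber <- as.numeric(as.character(melt$wavenumber))

head(melt)

# Plot only the first 1500 rows (the first 75 wavenumbers of all 20 measurements)
ggplot(melt[1:1500, ], aes(x = wavenumber, y = intensity, group = sample_id, color = sample_name)) +
  geom_line(lwd = 1.5) +
  theme_classic()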

This gives a plot similar to the one above: each sample gets one line per measurement, and the lines belonging to the same sample share a color.
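
Applied to the code from the question, the same idea might look roughly like the sketch below. The spectroscopic_data here is a made-up stand-in (I am assuming one row per measurement and a sample_name column, as in your melt() call), and the named color vector is just one way to pin a fixed color to each sample name.

```r
library(ggplot2)
library(reshape2)

# Made-up stand-in for the real spectroscopic_data: one row per measurement
spectroscopic_data <- data.frame(
  sample_name = c("aaa", "aaa", "bbb", "bbb"),
  "4000" = c(0.426, 0.405, 0.409, 0.395),
  "4004" = c(0.430, 0.408, 0.411, 0.399),
  "4008" = c(0.432, 0.411, 0.413, 0.401),
  check.names = FALSE
)

# Unique id per measurement, added before reshaping
spectroscopic_data$sample_id <- seq_len(nrow(spectroscopic_data))

melt_data <- melt(spectroscopic_data,
                  id.vars = c("sample_name", "sample_id"),
                  variable.name = "wavenumber", value.name = "intensity")
melt_data$wavenumber <- as.numeric(as.character(melt_data$wavenumber))

# group by measurement id, color by sample name
p <- ggplot(melt_data, aes(x = wavenumber, y = intensity,
                           group = sample_id, color = sample_name)) +
  geom_line() +
  scale_x_reverse() +
  # a named vector maps each sample name to a fixed color
  scale_color_manual(values = c(aaa = "#FF0000", bbb = "#0000CC")) +
  theme_bw() +
  theme(legend.position = "bottom") +
  labs(x = "wavenumbers", y = "intensity", colour = "")
p
```

The rest of your styling (breaks, guides, text sizes) should carry over unchanged, since only the group aesthetic and the extra id column differ from your original code.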

If I'm still missing what you're actually after, I apologize!
