an.be

Reputation: 33

Plotting spectroscopic data using ggplot2

I am trying to plot spectroscopic data using ggplot2. I get my data in the following form:

data structure

My code so far is:

library(ggplot2)
library(reshape2)
melt_data <- melt(spectroscopic_data, id.vars = "sample_name", variable.name = "wavenumber", value.name = "intensity")
melt_data$sample_name = factor(melt_data$sample_name)
melt_data$wavenumber = as.numeric(levels(melt_data$wavenumber))[melt_data$wavenumber]
ggplot(melt_data, aes(x=wavenumber, y=intensity, group=sample_name, color=sample_name)) + geom_line() +
scale_x_reverse(breaks=c(10000, 9500, 9000, 8500, 8000, 7500, 7000, 6500, 6000, 5500, 5000, 4500, 4000)) +
scale_color_manual(values=c("#FF0000", "#0000CC", "#00CC00", "#FF00FF", "#FF9900", "#000000", "#999900", "#33FFFF", "#FFCCFF", "#FFFF00", "#999999", "#9933FF", "#993300", "#99FF33")) + 
theme_bw() + 
theme(legend.position = "bottom") +
labs(x=expression(wavenumbers), y="intensity", colour = "") + 
theme(legend.text=element_text(size=10), axis.text=element_text(size=12), axis.title=element_text(size=14)) + 
guides(colour = guide_legend(ncol = 2, keywidth=1.5, keyheight=1, override.aes = list(size=1.8)))

I need the same color for all aaa samples, all bbb samples, and so on (multiple measurements of one sample), but the plot does not come out right. Zoomed in, it looks like this:

zoom of current plot

It looks like ggplot2 connects the lines of two measurements of the same sample instead of plotting them separately. Does anyone have an idea? I have been trying to fix this for hours...

Thank you!

Upvotes: 2

Views: 808

Answers (2)

an.be

Reputation: 33

Here is my result after Luke C's awesome support:

library(ggplot2)
library(reshape2)

melted_data <- melt(newtestdata, id.vars = c("sample_name", 
"sample_id"), variable.name = "wavenumber", value.name = "intensity")

melted_data$wavenumber=as.numeric(levels(melted_data$wavenumber))[melted_data$wavenumber]

ggplot(melted_data, aes(x=wavenumber, y=intensity, group = sample_id, color = sample_name)) + geom_line() +
scale_x_reverse(breaks=c(1005, 1200, 1400), expand = c(0.01, 0.01)) +  
scale_y_continuous(breaks=c(0, 0.5, 1.0, 1.5, 2.0), expand = c(0.01, 0.01)) +

scale_color_manual(values=c("#FF0000", "#0000CC", "#00CC00", "#FF00FF", "#FF9900", "#000000")) +

theme_bw() + 
theme(legend.position = "bottom") + 
theme(plot.margin=unit(c(1,1,0.5,1),"cm")) +

labs(x=expression(wavenumbers~"in"~cm^{"-1"}), y="absorbance in a.u.", colour = "") + 
theme(legend.text=element_text(size=10), axis.text=element_text(size=12), axis.title=element_text(size=14)) + 
guides(colour = guide_legend(ncol = 3, keywidth=1.5, keyheight=1, override.aes = list(size=1.2)))

ggsave("buechi-all.pdf", width = 11.69, height = 8.27)

Data Structure

Result
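
The conversion of wavenumber in the code above matters because melt() stores the former column names as a factor. A small sketch with made-up labels (the vector f is purely illustrative, not taken from the real data) shows the difference between reading the labels and reading the internal integer codes:

```r
# Hypothetical factor of wavenumber labels, similar to what melt() builds from column names
f <- factor(c("4000", "4004", "4008", "4000"),
            levels = c("4000", "4004", "4008"))

codes  <- as.numeric(f)                 # internal level codes, not the labels
labels <- as.numeric(levels(f))[f]      # idiom used above: convert the levels, then index
same   <- as.numeric(as.character(f))   # gives the same values as the idiom above

codes   # 1 2 3 1
labels  # 4000 4004 4008 4000
```

Using as.numeric() directly on the factor would therefore place the spectra on an axis running 1, 2, 3, ... instead of on the real wavenumbers.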

Upvotes: 1

Luke C

Reputation: 10336

One way is to add a sample id to your data frame before you reshape it. That lets you keep names like "aaa" and "bbb" for the color while a unique identifier acts as the grouping variable. Without it, ggplot2 cannot tell apart two observations that share the same sample name and the same x value, so it joins them into a single line. Here is an example where I tried to mimic your input data:

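library(ggplot2)
library(reshape2)

# Build with data.frame() so the intensities stay numeric;
# cbind() on mixed types would coerce everything to character
ex <- data.frame(
  sample_name = c("aaa", "aaa", "bbb", "bbb"),
  `4000` = c(0.426, 0.405, 0.409, 0.395),
  `4004` = c(0.430, 0.408, 0.411, 0.399),
  `4008` = c(0.432, 0.411, 0.413, 0.401),
  check.names = FALSE
)

# One unique id per row (one per measurement) to use as the grouping variable
ex$sample_id <- 1:nrow(ex)

melted <- melt(ex, id.vars = c("sample_name", "sample_id"), variable.name = "wavenumber", value.name = "intensity")

ggplot(melted, aes(x = wavenumber, y = intensity, group = sample_id, color = sample_name)) +
  geom_line() +
  theme_classic()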

This draws a separate line for each measurement (grouped by sample_id) while coloring the lines by sample_name:

[Image: resulting plot]

Is that sort of what you're after?
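
To see why the extra id is needed, it can help to count how many rows share the same x value inside each group. The sketch below uses only base R on a small made-up long-format table (the name long and its layout are illustrative assumptions, not the asker's actual data). Grouping by sample_name leaves two intensities per wavenumber, so geom_line() has to join them in one zigzag path, while grouping by sample_id leaves exactly one:

```r
# Hypothetical long-format data: two measurements each of "aaa" and "bbb"
long <- data.frame(
  sample_name = rep(c("aaa", "aaa", "bbb", "bbb"), times = 3),
  sample_id   = rep(1:4, times = 3),
  wavenumber  = rep(c(4000, 4004, 4008), each = 4),
  intensity   = c(0.426, 0.405, 0.409, 0.395,
                  0.430, 0.408, 0.411, 0.399,
                  0.432, 0.411, 0.413, 0.401)
)

# Rows per (group, x) combination under each candidate grouping variable
per_name <- table(long$sample_name, long$wavenumber)
per_id   <- table(long$sample_id, long$wavenumber)

max(per_name)  # 2: two points share each x within a group
max(per_id)    # 1: one point per x, so each measurement gets its own line
```

Any grouping variable that yields a count of 1 everywhere should work as the group aesthetic; a running row index is just the simplest choice.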


Edits below

To show the same approach with a larger dataset:

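# Twenty rows: two measurements each of "aaa", "bbb", ..., "jjj"
alpha <- rep(sapply(letters[1:10], function(x) {paste(x, x, x, sep = "")}), each = 2)
adf <- data.frame(alpha)
adf$sample_id <- seq_along(alpha)
adf$t <- rnorm(20, 0.4, 0.1)  # starting intensity for each measurement

# Columns 4 to 1503: each value is the previous column plus a small random step
wavenum <- seq(4, 1503)
for (i in wavenum) {
  for (j in seq_along(alpha)) {
    adf[j, i] <- adf[j, i - 1] + rnorm(1, 0.01, 0.01)
  }
}
adf[1:10, 1:10]

# Rename the columns: two id columns followed by 1501 wavenumbers
anames <- c("sample_name", "sample_id", (1400 + 4 * seq(0, 1500)))
names(adf) <- anames

melted <- melt(adf, id.vars = c("sample_name", "sample_id"), variable.name = "wavenumber", value.name = "intensity")

head(melted)

# Plot only the first 1500 rows (the first 75 wavenumbers for all 20 measurements)
ggplot(melted[1:1500, ], aes(x = wavenumber, y = intensity, group = sample_id, color = sample_name)) +
  geom_line(lwd = 1.5) +
  theme_classic()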

This gives a plot similar to the one above: each measurement gets its own line, and both measurements of a sample share the same color.

[Image: resulting plot]
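
The nested loop above fills the data frame one cell at a time, which can be slow for 1,500 columns. As a hedged alternative sketch (not the code used for the plot above; the names steps, spectra and adf2 and the seed are made up here), the same kind of random-walk spectra can be generated in one step from a matrix of increments:

```r
set.seed(42)  # arbitrary seed, only for reproducibility

n_samples <- 20
n_points  <- 1501

# First column: starting intensities; remaining columns: small random increments
steps <- matrix(rnorm(n_samples * n_points, mean = 0.01, sd = 0.01),
                nrow = n_samples)
steps[, 1] <- rnorm(n_samples, mean = 0.4, sd = 0.1)

# Row-wise cumulative sums turn the increments into one spectrum per row
spectra <- t(apply(steps, 1, cumsum))
colnames(spectra) <- 1400 + 4 * seq(0, n_points - 1)

adf2 <- data.frame(
  sample_name = rep(strrep(letters[1:10], 3), each = 2),  # "aaa", "aaa", "bbb", ...
  sample_id   = seq_len(n_samples),
  spectra,
  check.names = FALSE  # keep numeric-looking column names such as "1400"
)
dim(adf2)  # 20 1503
```

The resulting adf2 has the same shape as adf after renaming, so it could be passed to melt() with the same id.vars.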

If I'm still missing what you're actually after, I apologize!

Upvotes: 0
