Reputation: 11
I have a large dataset that I'd like to use to plot genetic divergence along chromosomes. The data frame I am using has the following format.
ID Group 100 270 310 430 460 550 580 660 710 740
Strain1 A 0.191 0.147 0.124 0.149 0.193 0.189 0.123 0.189 0.151 0.180
Strain2 A 0.188 0.188 0.149 0.136 0.000 0.199 0.199 0.188 0.149 0.000
Strain3 B 0.123 0.147 0.190 0.061 0.148 0.149 0.148 0.197 0.178 0.172
Strain4 B 0.147 0.197 0.188 0.178 0.179 0.149 0.191 0.154 0.179 0.187
I'd like to use ggplot2 to plot a line for each strain, with the lines colored according to group affiliation, and a continuous x-axis running from chromosome positions 100 through 740. I cannot figure out how to melt the data without extracting the group info first and then adding it back after melting. Can anyone suggest a one-step approach to accomplish this?
Upvotes: 1
Views: 791
Reputation: 11
The answer by akrun is almost there, except that there should be one line plotted for each strain. For more information, here's a link to a screenshot (sorry, I need more rep to post the actual image) of a Shiny app I'm working on that plots chromosome similarity between a selected fungal strain and a collection of other strains that infect different host grass species: Shiny App plot. The current plot shows genetic divergence between strain 87-120 and 10 rice (Oryza)-infecting strains (colored red), 7 St. Augustinegrass (Stenotaphrum)-infecting strains (dark blue) and 8 finger millet (Eleusine)-infecting strains (light blue). My current problem is that the x-axis values do not represent chromosome positions (they are analysis window numbers instead). I need to melt (or gather) the data frame so that the chromosome positions stored in the column headers can be used for the x-axis and the Group information for the color.
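For reference, here is a minimal sketch of the reshaping step described above. It assumes the four sample rows from the question are loaded into a data frame called `mydata`. The column names `position` and `divergence` are made up for illustration. Both `ID` and `Group` are excluded from the gather, so they are carried along as identifier columns in one step, and `convert = TRUE` asks tidyr to turn the position headers into numbers.

```r
library(tidyr)

# Sample rows copied from the question. check.names = FALSE keeps the
# position headers as "100", "270", ... instead of "X100", "X270", ...
mydata <- read.table(text = "
ID Group 100 270 310 430 460 550 580 660 710 740
Strain1 A 0.191 0.147 0.124 0.149 0.193 0.189 0.123 0.189 0.151 0.180
Strain2 A 0.188 0.188 0.149 0.136 0.000 0.199 0.199 0.188 0.149 0.000
Strain3 B 0.123 0.147 0.190 0.061 0.148 0.149 0.148 0.197 0.178 0.172
Strain4 B 0.147 0.197 0.188 0.178 0.179 0.149 0.191 0.154 0.179 0.187
", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)

# Gather every column except ID and Group; both stay attached to each row.
# "position" and "divergence" are illustrative names, not from the question.
long <- gather(mydata, key = "position", value = "divergence",
               -ID, -Group, convert = TRUE)

str(long)
```

Each strain then contributes one row per chromosome position, with its `Group` label repeated on every row, so no separate step is needed to add the group information back.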
Upvotes: 0
Reputation: 33782
I think this will work best if you colour by Group and facet on Strain. Assuming the data frame is named mydata:
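library(tidyr)    # gather(); also re-exports %>%
library(ggplot2)
mydata %>%
  # Gather every column except Group and ID into Var (position) / Val (value).
  # Var stays character here, so the x-axis is discrete.
  gather(Var, Val, -Group, -ID) %>%
  ggplot(aes(Var, Val)) +
  geom_line(aes(color = Group, group = Group)) +
  # One panel per strain
  facet_grid(ID ~ .)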
Upvotes: 1
Reputation: 887148
We could gather the data into 'long' format and then plot with ggplot:
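library(ggplot2)
library(dplyr)
library(tidyr)
# Gather the position columns (column 3 onward) into key/val pairs
gather(df1, key, val, 3:ncol(df1)) %>%
  # Column headers come through as character; make them numeric for a continuous x-axis
  mutate(key = as.numeric(key)) %>%
  ggplot() +
  # group = Group draws one path per group
  geom_line(aes(x = key, y = val, group = Group, color = Group))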
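If one line per strain is wanted, a possible variation is to group by `ID` while still coloring by `Group`. The sketch below is an adaptation rather than the code above. It assumes `df1` holds the sample rows from the question, and the axis labels are illustrative.

```r
library(tidyr)
library(ggplot2)

# Sample rows copied from the question; check.names = FALSE keeps "100", "270", ...
df1 <- read.table(text = "
ID Group 100 270 310 430 460 550 580 660 710 740
Strain1 A 0.191 0.147 0.124 0.149 0.193 0.189 0.123 0.189 0.151 0.180
Strain2 A 0.188 0.188 0.149 0.136 0.000 0.199 0.199 0.188 0.149 0.000
Strain3 B 0.123 0.147 0.190 0.061 0.148 0.149 0.148 0.197 0.178 0.172
Strain4 B 0.147 0.197 0.188 0.178 0.179 0.149 0.191 0.154 0.179 0.187
", header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)

# Long format; convert = TRUE makes the position key numeric
long <- gather(df1, key, val, -ID, -Group, convert = TRUE)

# group = ID gives one line per strain; color = Group colors lines by group
p <- ggplot(long, aes(x = key, y = val, group = ID, color = Group)) +
  geom_line() +
  labs(x = "Chromosome position", y = "Divergence")  # illustrative labels
p
```

Because `key` is numeric, the x-axis is continuous and spaced by actual chromosome position rather than by column order.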
Upvotes: 1