Reputation: 179
I would like to reproduce this figure from a recent publication in R, but I am unsure how.
The plot's idea is simple. At the top is a representation of a full length virus sequence and each line underneath it depicts a sequenced isolate.
For each sequence, there are two pieces of information:
Where the sequence starts, ends, deleted, etc For instance: sequence 1, starts at position 1 and ends at position 9000 But sequence 2, starts at position 1, ends at 2000, in the middle everything is deleted and then starts again at 8000-9000
Color-coded based on whether it is deleted, full-length, etc
Initially I thought I could use a bar graph, like this one I drew in Illustrator, where the x would be essentially each sequence line and the y would be the coordinates of where it maps. But I am not sure if this will allow me to designate "gaps", as seen in sequence 3 in my illustrator picture.
The data itself is organized as such:
Sequence name Mapped Start Mapped End
1 1 9000
2 4000 9000
3 1 2000
3 7000 9000
The data set only includes the mapped start and end positions, not the deleted positions.
Would appreciate hearing your input!
Thanks
Upvotes: 1
Views: 762
Reputation: 721
I might recommend using a series of geoms, one for each sequence. If you have your data organized in a certain way, it would be fairly straightforward. For example, if your data is in long format as follows:
dat <- data.frame(sequence=c(1,2,2,2), start=c(1,1,2001,8000), stop=c(9000,2000,7999,9000), type=c("mapped","mapped","deletion","mapped"))
Which looks like...
sequence start stop type
1 1 9000 mapped
2 1 2000 mapped
2 2001 7999 deletion
2 8000 9000 mapped
You could do the following:
library(ggplot2)
g <- ggplot(data=dat, mapping=aes(ymin=0, ymax=1, xmin=start, xmax=stop, fill=type)) +
geom_rect() + facet_grid(sequence~., switch="y") +
labs(x="Position (BP)", y="Sequence / Strain", title="Mapped regions for all sequences") +
theme(axis.text.y=element_blank(), axis.ticks.y=element_blank()) +
theme(plot.title = element_text(hjust = 0.5))
Upvotes: 2