Greg T.D.

Reputation: 49

Make a plot using one column as start and another as stop for boxes

I have a .bed file that details where specific genes are in a genome, and I'd like to plot boxes depicting these genes over an x axis depicting the chromosomes they're on. There seem to be a lot of tools for doing this in various ways, but they all look pretty ugly, so if possible I'd like to do it with ggplot, where I have more control over the final product's look. The file is formatted in five columns as:

chromosomeNAME geneStart geneEnd geneName ChrmLength
         chrm1   3714014 3735354    geneA    6509629
         chrm1   4130851 4178170    geneB    6509629
         chrm2    264426  307752    geneC    5196352
         chrm2    334381  382612    geneD    5196352

So I'd like the above to produce two x axes, one for chromosome one and one for chromosome two, each running from 1 to ChrmLength, with boxes above the axis that are filled in from geneStart to geneEnd. Ideally these boxes would be color coded in a legend.

I know how to facet the graph but I'm not sure how to get ggplot to output a filled-in box using the start/end coordinates as depicted above.

Thanks!

Upvotes: 0

Views: 257

Answers (1)

olorcain

Reputation: 1248

Maybe something like this?

[Image: plot produced by the code below]

If that's useful, my code is as follows.

library(scales)   # comma() for the x-axis labels
library(dplyr)
library(ggplot2)

# Sample data from the question
chromosomeNAME <- c("chrm1", "chrm1", "chrm2", "chrm2")
geneStart <- c(3714014, 4130851, 264426, 334381)
geneEnd <- c(3735354, 4178170, 307752, 382612)
geneName <- c("geneA", "geneB", "geneC", "geneD")
ChrmLength <- c(6509629, 6509629, 5196352, 5196352)
df <- data.frame(chromosomeNAME, geneStart, geneEnd, geneName, ChrmLength)

# One row per chromosome, spanning 0 to its length
chromosomes <- df %>%
  select(chromosomeNAME, ChrmLength) %>%
  unique() %>%
  mutate(ChrmStart = 0)

# One row per gene
genes <- df %>%
  select(chromosomeNAME, geneName, geneStart, geneEnd)

ggplot(chromosomes) +
  # Background bar for the whole chromosome
  geom_rect(mapping = aes(xmin = ChrmStart, xmax = ChrmLength, ymin = 0, ymax = 1), fill = "lightblue") +
  # Gene boxes from geneStart to geneEnd, colored by gene name (this also creates the legend)
  geom_rect(data = genes, mapping = aes(xmin = geneStart, xmax = geneEnd, ymin = 0, ymax = 1, fill = geneName)) +
  theme_light() +
  # The y axis carries no information, so hide it
  theme(axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank()) +
  scale_x_continuous(labels = comma) +
  # One panel per chromosome, stacked vertically
  facet_grid(chromosomeNAME ~ .)
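
To start from the actual .bed file rather than typing the vectors in by hand, base R's `read.table` should be enough to build `df`. This is only a sketch under some assumptions: the file is whitespace-delimited and includes the header row shown in the question, and `bed_path` is a placeholder name. Here it points at a temp file filled with the question's four sample rows so the snippet runs on its own.

```r
# bed_path is a placeholder: a temp file holding the question's sample rows.
# Point it at the real .bed file instead.
bed_path <- tempfile(fileext = ".bed")
writeLines(c(
  "chromosomeNAME geneStart geneEnd geneName ChrmLength",
  "chrm1 3714014 3735354 geneA 6509629",
  "chrm1 4130851 4178170 geneB 6509629",
  "chrm2 264426 307752 geneC 5196352",
  "chrm2 334381 382612 geneD 5196352"
), bed_path)

# header = TRUE assumes the file carries the column-name row from the question;
# the default sep = "" splits on any run of spaces or tabs.
df <- read.table(bed_path, header = TRUE, stringsAsFactors = FALSE)

str(df)
```

If the real file has no header row, `header = FALSE` together with `col.names = c(...)` would be the usual alternative. The rest of the code can then use `df` unchanged.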

Upvotes: 2
