James

Reputation: 69

Create a multiline plot from a dataset with time on one axis and genes on the other

I have a dataset with mean gene counts for each decade as shown below:

structure(list(decade_0 = c(92.500989948184, 2788.27384875413, 
28.6937227408861, 1988.03831525414, 1476.83143096418), decade_1 = c(83.4606306426572, 
537.725421951383, 10.2747132062782, 235.380422949258, 685.043600629146
), decade_2 = c(188.414375201462, 2091.84249935145, 17.080858894829, 
649.55107199935, 1805.3484565514), decade_3 = c(43.3316024314987, 
141.64396529835, 2.77851259926935, 94.7748265692319, 413.248354335235
), decade_4 = c(54.4891626582901, 451.076574268175, 12.4298374245007, 
346.102609621018, 769.215535857077), decade_5 = c(85.5621750431284, 
131.822699578988, 13.3130607062134, 151.002200923853, 387.727911723968
), decade_6 = c(112.860998806804, 4844.59668489898, 19.7317645111144, 
2084.76584309876, 766.375852567831), decade_7 = c(73.2198969730458, 
566.042952305845, 3.2457873699886, 311.853982701609, 768.801733767044
), decade_8 = c(91.8161648275608, 115.161700090147, 10.7289451320065, 
181.747670625714, 549.21661120626), decade_9 = c(123.31045087146, 
648.23694540667, 17.7690326882018, 430.301803845829, 677.187054208271
)), row.names = c("ANK1", "NTN4", "PTPRH", "JAG1", "PLAT"), class = "data.frame")

I would like to plot a line graph showing the change in counts over time for each of >30 genes, like this one made in Excel.

Graph of genes over time

To do this with ggplot, I need to reshape the data into three columns: decade, gene, and count. How can I convert my table into this ggplot-friendly format, or is there a better way to produce the plot with a different tool?

Thanks!

Upvotes: 2

Views: 202

Answers (1)

tjebo

Reputation: 23737

One possibility: transpose your data frame, convert the row names to a column, then reshape to long format with pivot_longer(). Plotting is then easy.

library(tidyverse)
mydat <- structure(list(
  decade_0 = c(92.500989948184, 2788.27384875413, 28.6937227408861,
               1988.03831525414, 1476.83143096418),
  decade_1 = c(83.4606306426572, 537.725421951383, 10.2747132062782,
               235.380422949258, 685.043600629146),
  decade_2 = c(188.414375201462, 2091.84249935145, 17.080858894829,
               649.55107199935, 1805.3484565514),
  decade_3 = c(43.3316024314987, 141.64396529835, 2.77851259926935,
               94.7748265692319, 413.248354335235),
  decade_4 = c(54.4891626582901, 451.076574268175, 12.4298374245007,
               346.102609621018, 769.215535857077),
  decade_5 = c(85.5621750431284, 131.822699578988, 13.3130607062134,
               151.002200923853, 387.727911723968),
  decade_6 = c(112.860998806804, 4844.59668489898, 19.7317645111144,
               2084.76584309876, 766.375852567831),
  decade_7 = c(73.2198969730458, 566.042952305845, 3.2457873699886,
               311.853982701609, 768.801733767044),
  decade_8 = c(91.8161648275608, 115.161700090147, 10.7289451320065,
               181.747670625714, 549.21661120626),
  decade_9 = c(123.31045087146, 648.23694540667, 17.7690326882018,
               430.301803845829, 677.187054208271)
), row.names = c("ANK1", "NTN4", "PTPRH", "JAG1", "PLAT"), class = "data.frame")


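# Transpose so decades become rows, move the row names into a 'decade' column,
# then pivot the gene columns into long format (one row per decade/gene pair).
newdat <- mydat %>%
  t() %>%
  as.data.frame() %>%
  tibble::rownames_to_column('decade') %>%
  pivot_longer(-decade, names_to = 'gene', values_to = 'count')

# 'decade' is a character column here, so group = gene is needed to connect
# the points of each gene into one line.
ggplot(newdat) +
  geom_line(aes(decade, count, color = gene, group = gene))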

Created on 2020-02-14 by the reprex package (v0.3.0)
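
A possible variation on the approach above, offered as a sketch rather than a definitive solution: the transpose can be skipped by moving the gene names into a column first and pivoting the decade columns directly. The example below assumes tidyr 1.1.0 or later (for the names_transform argument) and uses a small, rounded subset of the question's data under the hypothetical name mydat_small, so that it runs on its own. Stripping the "decade_" prefix and converting to integer gives a numeric x axis, which keeps the decades in order if there are ever more than ten.

```r
library(tidyr)
library(tibble)
library(ggplot2)

# Illustrative subset of the question's data (three genes, three decades,
# values rounded to one decimal). 'mydat_small' is a hypothetical name.
mydat_small <- data.frame(
  decade_0 = c(92.5, 2788.3, 28.7),
  decade_1 = c(83.5, 537.7, 10.3),
  decade_2 = c(188.4, 2091.8, 17.1),
  row.names = c("ANK1", "NTN4", "PTPRH")
)

# Move the gene names out of the row names into a regular column.
with_gene <- rownames_to_column(mydat_small, "gene")

# Pivot every decade column into long format: one row per gene/decade pair.
# names_prefix drops "decade_", names_transform turns the rest into integers.
long_dat <- pivot_longer(
  with_gene,
  cols = -gene,
  names_to = "decade",
  names_prefix = "decade_",
  names_transform = list(decade = as.integer),
  values_to = "count"
)

# With a numeric x axis, mapping color to gene is enough to get one line per gene.
p <- ggplot(long_dat, aes(decade, count, color = gene)) +
  geom_line() +
  scale_x_continuous(breaks = unique(long_dat$decade))

print(p)
```

The same calls should work unchanged on the full mydat from the question. Because the counts span several orders of magnitude, adding scale_y_log10() to the plot may make the low-count genes easier to read.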

Upvotes: 2
