Reputation: 69
I have a dataset with mean gene counts for each decade as shown below:
structure(list(decade_0 = c(92.500989948184, 2788.27384875413,
28.6937227408861, 1988.03831525414, 1476.83143096418), decade_1 = c(83.4606306426572,
537.725421951383, 10.2747132062782, 235.380422949258, 685.043600629146
), decade_2 = c(188.414375201462, 2091.84249935145, 17.080858894829,
649.55107199935, 1805.3484565514), decade_3 = c(43.3316024314987,
141.64396529835, 2.77851259926935, 94.7748265692319, 413.248354335235
), decade_4 = c(54.4891626582901, 451.076574268175, 12.4298374245007,
346.102609621018, 769.215535857077), decade_5 = c(85.5621750431284,
131.822699578988, 13.3130607062134, 151.002200923853, 387.727911723968
), decade_6 = c(112.860998806804, 4844.59668489898, 19.7317645111144,
2084.76584309876, 766.375852567831), decade_7 = c(73.2198969730458,
566.042952305845, 3.2457873699886, 311.853982701609, 768.801733767044
), decade_8 = c(91.8161648275608, 115.161700090147, 10.7289451320065,
181.747670625714, 549.21661120626), decade_9 = c(123.31045087146,
648.23694540667, 17.7690326882018, 430.301803845829, 677.187054208271
)), row.names = c("ANK1", "NTN4", "PTPRH", "JAG1", "PLAT"), class = "data.frame")
I would like to plot a line graph showing how the counts change over time for each of more than 30 genes, like this one made in Excel.
To do this with ggplot, I have to convert the data to a long table with col1: decade, col2: gene, col3: counts. My question is either how to convert my table into this ggplot-friendly format, or whether there is a better way to produce the plot with a different tool.
Thanks!
Upvotes: 2
Views: 202
Reputation: 23737
One possibility: transpose your data frame, convert the row names to a column, then pivot to long format (the step formerly done with gather). Plotting is then easy.
library(tidyverse)
mydat <- structure(list(decade_0 = c(92.500989948184, 2788.27384875413,
28.6937227408861, 1988.03831525414, 1476.83143096418), decade_1 = c(83.4606306426572,
537.725421951383, 10.2747132062782, 235.380422949258, 685.043600629146
), decade_2 = c(188.414375201462, 2091.84249935145, 17.080858894829,
649.55107199935, 1805.3484565514), decade_3 = c(43.3316024314987,
141.64396529835, 2.77851259926935, 94.7748265692319, 413.248354335235
), decade_4 = c(54.4891626582901, 451.076574268175, 12.4298374245007,
346.102609621018, 769.215535857077), decade_5 = c(85.5621750431284,
131.822699578988, 13.3130607062134, 151.002200923853, 387.727911723968
), decade_6 = c(112.860998806804, 4844.59668489898, 19.7317645111144,
2084.76584309876, 766.375852567831), decade_7 = c(73.2198969730458,
566.042952305845, 3.2457873699886, 311.853982701609, 768.801733767044
), decade_8 = c(91.8161648275608, 115.161700090147, 10.7289451320065,
181.747670625714, 549.21661120626), decade_9 = c(123.31045087146,
648.23694540667, 17.7690326882018, 430.301803845829, 677.187054208271
)), row.names = c("ANK1", "NTN4", "PTPRH", "JAG1", "PLAT"), class = "data.frame")
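newdat <- mydat %>%
  t() %>% as.data.frame() %>%                # transpose: decades become rows, genes become columns
  tibble::rownames_to_column('decade') %>%   # keep the decade labels as a regular column
  pivot_longer(-decade, names_to = 'gene', values_to = 'count')  # one row per decade/gene pair
# group = gene is needed because decade is a character (discrete) x axis
ggplot(newdat) + geom_line(aes(decade, count, color = gene, group = gene))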
Created on 2020-02-14 by the reprex package (v0.3.0)
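A possible variation, sketched under a couple of assumptions: the transpose can be skipped by moving the gene names into a column first and pivoting the decade columns directly. The `names_prefix` argument of `pivot_longer` strips the `decade_` prefix so the decade can be converted to an integer, which gives a continuous x axis. The small `mydat` below is only a stand-in with two genes and three decades (values rounded from the question) so that the snippet runs on its own; with the real data, the pipeline should be the same.

```r
library(tidyverse)

# Stand-in data: same layout as the question, rounded values, fewer rows/columns
mydat <- data.frame(
  decade_0 = c(92.5, 2788.3),
  decade_1 = c(83.5, 537.7),
  decade_2 = c(188.4, 2091.8),
  row.names = c("ANK1", "NTN4")
)

newdat <- mydat %>%
  tibble::rownames_to_column("gene") %>%      # gene names become a regular column
  pivot_longer(-gene,
               names_to = "decade",
               names_prefix = "decade_",      # "decade_3" -> "3"
               values_to = "count") %>%
  mutate(decade = as.integer(decade))         # numeric x axis instead of labels

# With a numeric x axis, mapping color to gene is enough to get one line per gene
p <- ggplot(newdat, aes(decade, count, color = gene)) + geom_line()
p
```

Whether a numeric or a labelled x axis reads better depends on the plot; the original approach keeps the `decade_0` style labels, this one shows plain numbers.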
Upvotes: 2