KarthicdotK

Reputation: 53

Consolidate the dataframe to plot line graph in R

I have a frequency table of lengths that I would like to plot as a line graph, preferably using ggplot2. The table has 13373 observations, which makes it difficult to plot every point. Is there a way to consolidate the data into fewer observations so that the plot looks clean?

The head of the dataframe:

  Length Freq
1  151    1
2  152    1
3  159    1
4  168    2
5  174    1
6  177    1

The summary of the length variable:

 Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
    151    1692    4624    9795    9921  834300 

I am basically looking for a plot similar to this line chart

Thanks a lot, Karthic K

Upvotes: 0

Views: 69

Answers (1)

Fnguyen

Reputation: 1177

Yes, you can! Gene length is a numerical variable, so you can bin it into groups with cut() and then sum the frequencies within each bin, like this:

library(dplyr)

df %>%
  mutate(Length_bin = cut(Length, breaks = 100)) %>%  # split Length into 100 equal-width bins
  group_by(Length_bin) %>%
  summarise(Freq = sum(Freq))                         # total frequency per bin

You can define the number of breaks or manually input cut points.
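
To illustrate the manual cut points option and the plotting step, here is a minimal sketch. The toy data frame and the roughly log-spaced `breaks` vector are assumptions made up for the example (your real table and sensible cut points will differ); since Length is heavily right-skewed, equal-width bins would likely put most of the data in the first bin, which is why wider bins are used for larger values here.

```r
library(dplyr)
library(ggplot2)

# Toy data standing in for the real frequency table (values are made up)
df <- data.frame(
  Length = c(151, 152, 159, 168, 174, 177, 1692, 4624, 9921, 834300),
  Freq   = c(1, 1, 1, 2, 1, 1, 3, 5, 2, 1)
)

# Hypothetical manual cut points, roughly log-spaced
breaks <- c(100, 1000, 10000, 100000, 1000000)

binned <- df %>%
  mutate(Length_bin = cut(Length, breaks = breaks, dig.lab = 7)) %>%
  group_by(Length_bin) %>%
  summarise(Freq = sum(Freq))  # empty bins are dropped by group_by()

# group = 1 lets geom_line() connect points across the discrete x axis
p <- ggplot(binned, aes(x = Length_bin, y = Freq, group = 1)) +
  geom_line() +
  geom_point() +
  labs(x = "Length bin", y = "Frequency")
p
```

Note that bins with no observations disappear from the summary, so the line skips over them rather than dropping to zero.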

Upvotes: 1
