Reputation: 53
I have a frequency table of lengths that I would like to plot as a line graph, preferably using ggplot2. The table has 13373 observations, which makes it difficult to plot every point. Is there a way to consolidate the data into fewer observations so that the plot looks good?
The head of the data frame:
  Length Freq
1    151    1
2    152    1
3    159    1
4    168    2
5    174    1
6    177    1
The summary of the length variable:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    151    1692    4624    9795    9921  834300
I am basically looking for a plot similar to this
Thanks a lot, Karthic K
Upvotes: 0
Views: 69
Reputation: 1177
Yes, you can! Gene length is a numerical variable, so you can bin it with cut() and sum the frequencies within each bin, like this:
library(dplyr)

# Bin Length into 100 equal-width intervals, then total Freq within each bin
df %>%
  mutate(Length_bin = cut(Length, breaks = 100)) %>%
  group_by(Length_bin) %>%
  summarise(Freq = sum(Freq))
The breaks argument of cut() takes either a single number (how many equal-width bins to create) or a numeric vector of cut points that you choose yourself.
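Since the lengths in the question are heavily right-skewed (median 4624, max 834300), equal-width bins may put almost everything into the first few intervals. One possible alternative is a set of manual, log-spaced cut points, with the binned totals then drawn as a line in ggplot2. This is only a sketch: the df below is simulated stand-in data rather than the asker's table, and the choice of 50 bins, the geometric bin midpoint and the names brks, binned and Length_mid are all assumptions made for illustration.

```r
library(dplyr)
library(ggplot2)

# Simulated stand-in for the real frequency table (assumption, not the asker's data)
set.seed(1)
df <- data.frame(
  Length = sort(unique(round(rlnorm(5000, meanlog = 8.5, sdlog = 1.3)) + 150))
)
df$Freq <- rpois(nrow(df), 1) + 1L

# 51 log-spaced cut points give 50 bins; pin the end points to the exact
# data range so rounding error cannot push min/max outside the breaks
brks <- 10^seq(log10(min(df$Length)), log10(max(df$Length)), length.out = 51)
brks[c(1, length(brks))] <- range(df$Length)

binned <- df %>%
  # labels = FALSE returns the integer bin index instead of a factor label
  mutate(Length_bin = cut(Length, breaks = brks,
                          include.lowest = TRUE, labels = FALSE)) %>%
  group_by(Length_bin) %>%
  summarise(Freq = sum(Freq)) %>%
  # geometric mean of the two bin edges, used as the x position of each bin
  mutate(Length_mid = sqrt(brks[Length_bin] * brks[Length_bin + 1]))

p <- ggplot(binned, aes(x = Length_mid, y = Freq)) +
  geom_line() +
  scale_x_log10() +
  labs(x = "Length (bin midpoint, log scale)", y = "Frequency")
print(p)
```

With real data, replace the simulated df with your own table and adjust the number of cut points until the line is as smooth or as detailed as you need.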
Upvotes: 1