Johnathan

Reputation: 1907

Is there a way to plot a frequency histogram from a continuous variable?

I have DNA segment lengths (relative to the chromosome arm; 251296 entries), like this:

0.24592963
0.08555043
0.02128725
...

The range goes from 0 to 2, and I would like to make a continuous relative frequency plot. I know that I could bin the values and use a histogram, but I would like to show continuity. Is there a simple strategy? If not, I'll use binning. Thank you!

EDIT:

I have created a binning vector with 40 equally spaced values between 0 and 2 (both included). For simplicity's sake, is there a way to round each of the 251296 entries to the closest value within the binning vector? Thank you!

Upvotes: 2

Views: 2319

Answers (1)

Nathan S. Watson-Haigh

Reputation: 5123

Since almost none of your values are duplicated, there is no natural count per value to plot on the y-axis, so I'd probably go for a density plot. It highlights the dense regions, i.e. where lots of segment lengths fall close to each other.

# Only the three example values from the question; use the full vector in practice
d <- c(0.24592963, 0.08555043, 0.02128725)
plot(density(d), xlab = "DNA Segment Length", xlim = c(0, 2))
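With only three values the curve is purely illustrative. As a rough sketch of how this might look on a larger vector, the snippet below uses simulated data (the name `seg_len` and the beta-distributed values are assumptions, not the real data set) and restricts the density estimate to the 0 to 2 range with the `from` and `to` arguments of `density()`. It draws the estimate over a histogram scaled with `freq = FALSE`, so both share the same y-axis. Note that the y-axis is then a density (area under the curve is about 1), not a relative frequency per value.

```r
# Hypothetical stand-in for the real segment lengths (values between 0 and 2)
set.seed(42)
seg_len <- rbeta(5000, shape1 = 1, shape2 = 4) * 2

# Kernel density estimate, evaluated only on the valid range 0..2
dens <- density(seg_len, from = 0, to = 2)

# Histogram on the density scale, with the smooth estimate drawn on top
hist(seg_len, breaks = seq(0, 2, length.out = 41), freq = FALSE,
     xlab = "DNA Segment Length", main = "Histogram with density estimate")
lines(dens, lwd = 2)
```

Restricting the range only changes where the curve is evaluated; it does not correct for the boundary at 0, so the estimate may dip slightly near the edges.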

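If you do prefer the binning route from your edit, here is a minimal sketch of snapping each entry to the closest value of an equally spaced grid. It assumes the grid starts at 0 and is evenly spaced, as you describe; `seg_len`, `bins`, `idx` and `snapped` are illustrative names, and the data is simulated.

```r
# Hypothetical stand-in for the real segment lengths (values between 0 and 2)
set.seed(1)
seg_len <- runif(1000, min = 0, max = 2)

# 40 equally spaced values between 0 and 2, both included
bins <- seq(0, 2, length.out = 40)
step <- bins[2] - bins[1]

# Index of the nearest grid value; valid because the grid starts at 0
# and the spacing is constant
idx <- round(seg_len / step) + 1
snapped <- bins[idx]

# Relative frequency of each grid value (sums to 1)
rel_freq <- tabulate(idx, nbins = length(bins)) / length(seg_len)

# Joining the points with a line gives a continuous-looking frequency plot
plot(bins, rel_freq, type = "l",
     xlab = "DNA Segment Length", ylab = "Relative frequency")
```

Because the two end values only collect entries from half a step, their frequencies will tend to be about half those of their neighbours.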

Upvotes: 4
