Reputation: 121
I have a vector called ranks with values from -6 to 6. I want to count how many values fall into value intervals such as (2, Inf), (1.25, 2], etc., where each count includes the values in that interval plus those in the previous interval. In other words, the count for each interval should also include the count from the interval before it.
To do this I used a very primitive approach:
xmin <- c(0,
          sum(ranks>2),
          sum(ranks>2) + sum(ranks>1.25),
          sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.75),
          sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.5) + sum(ranks>0.25))
xmax <- c(c(sum(ranks>2),
            sum(ranks>2) + sum(ranks>1.25),
            sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.75),
            sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.5) + sum(ranks>0.25)) - 1,
          length(ranks))
Here xmin is the start of each interval and xmax is the end of each interval. I believe there is a much more straightforward way to do this.
Overall, I'd like to find the values separating these intervals: +Inf, 2, 1, 0.5, 0, -0.5, -1, -2, -Inf
PS: I'll be using these to annotate the x axis in ggplot2 as shown below (see the color scale from red to blue; those are rectangles with specific x and y delimiters).
Upvotes: 2
Views: 199
Reputation: 23767
I might still have misunderstood, but in the end it's just a matter of counting how many genes are in each interval, correct?
Basically you're re-inventing a stacked bar.
library(ggplot2)

# random data
set.seed(1)
rank <- sample(-6:6, 11000, replace = TRUE)

# vector of your cuts
my_cuts <- c(-Inf, -2, -1, -0.5, 0, 0.5, 1, 2, Inf)

## make a data frame and cut the ranks
genes <- data.frame(rank)
genes$cuts <- cut(genes$rank, my_cuts)

## just use geom_bar
ggplot(genes) +
  geom_bar(aes(y = 1, fill = cuts)) +
  ## now you can simply use one of the scale functions
  scale_fill_brewer(palette = "Reds")
Created on 2022-05-31 by the reprex package (v2.0.1)
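If you want the numbers themselves rather than the plot, here is a minimal sketch of the counting step, assuming the same simulated data and cut points as above. The name counts is only illustrative, and cut() is left at its default of right-closed intervals.

```r
# Assumption: same simulated data and break points as in the answer above
set.seed(1)
rank <- sample(-6:6, 11000, replace = TRUE)
my_cuts <- c(-Inf, -2, -1, -0.5, 0, 0.5, 1, 2, Inf)

# cut() assigns each value to an interval; table() counts values per interval
counts <- table(cut(rank, my_cuts))
counts

# every value lands in exactly one interval, so the counts add up to the total
sum(counts) == length(rank)
```

These per-interval counts are the same quantities that geom_bar stacks in the plot.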
Upvotes: 1
Reputation: 2532
Try this:
library(magrittr)  # provides the %>% pipe

c(1, 1, 2, 3, 2, 1, 4, 2, 5, 6, 2, 5, 3) %>%
  cut(c(0, 2, 4, 6)) %>%
  table() %>%
  cumsum()
(0,2] (2,4] (4,6]
    7    10    13
UPD: just noticed that you're arranging your intervals in reverse. I think the easiest way would be to convert the table to numeric and reverse it (rev) before applying cumsum.
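A minimal sketch of that reversed version, using the same toy vector. The names cum and rects are only illustrative, and it assumes each rectangle starts exactly where the previous one ends (the question subtracts 1 from the inner xmax values instead; adjust if you need that).

```r
x <- c(1, 1, 2, 3, 2, 1, 4, 2, 5, 6, 2, 5, 3)

# counts per interval, lowest interval first
counts <- table(cut(x, c(0, 2, 4, 6)))

# reverse so the highest interval comes first, then accumulate
cum <- cumsum(rev(as.numeric(counts)))

# assumption: each rectangle starts where the previous one ends
rects <- data.frame(
  interval = rev(names(counts)),
  xmin = c(0, head(cum, -1)),
  xmax = cum
)
rects
```

Here xmax comes out as 3, 6, 13 and xmin as 0, 3, 6, with the (4,6] interval first.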
Upvotes: 2