spiral01

Reputation: 545

Binning data by order with selected bin size

I have a vector of values that I want to sort in descending order and then split into bins of size 100, with the final bin containing all of the remaining values.

#generate random data
set.seed(1)
x <- rnorm(8366)

#In descending order
y <- x[order(-x)]

I have used cut to bin by value before, but here I want the bins to have a fixed size: the first bin holds the first 100 values in y, the second bin the next 100, and so on until there are ten bins, with the final bin holding all of the remaining values. I am not sure how to go about this.

Upvotes: 1

Views: 552

Answers (3)

moodymudskipper

Reputation: 47300

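You can use cut, taking every 100th value of the ascending vector rev(y) as a break point, so the fixed-size bins start from the smallest values: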

res <- cut(y, c(rev(y)[seq(1, 901, 100)], Inf), right = FALSE)
table(res)
# res
# [-3.67,-2.33) [-2.33,-2.05) [-2.05,-1.87) [-1.87,-1.72)  [-1.72,-1.6) 
#           100           100           100           100           100 
#   [-1.6,-1.5)  [-1.5,-1.41) [-1.41,-1.34) [-1.34,-1.27)   [-1.27,Inf) 
#           100           100           100           100          7466 
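
If the bins should instead follow the descending order from the question (largest 100 values first), one option is to cut on the position in y rather than on the value. This is only a sketch; breaks and bin are illustrative names, not part of the answer above.

```r
set.seed(1)
x <- rnorm(8366)
y <- x[order(-x)]

# Cut on the position in y rather than on the value itself:
# positions 1-100 fall in bin 1, ..., 801-900 in bin 9, the rest in bin 10.
breaks <- c(seq(0, 900, by = 100), length(y))
bin <- cut(seq_along(y), breaks = breaks, labels = FALSE)

# Number of values per bin
table(bin)
```

With labels = FALSE, cut returns integer bin codes, so y[bin == 1] picks out the 100 largest values.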

Upvotes: 0

mavery

Reputation: 101

I'm not sure what you mean by "bin". Do you want to summarize each group of 100 values in some way, for example by summing them? If so, here's one solution:

# Generate random data
set.seed(1)
x <- rnorm(8366)

# Pad the length up to the next multiple of 100 with zeros
n <- ceiling(length(x) / 100) * 100
y <- rep(0, n)

# In descending order
y[seq_along(x)] <- x[order(-x)]

# One row per group of 100 consecutive values
X <- matrix(y, ncol = 100, byrow = TRUE)
rowSums(X)
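
Zero-padding is harmless for sums, but it would bias a summary such as the mean of the last group. A sketch that avoids padding altogether, assuming a position-based group index passed to tapply (group and group_means are illustrative names):

```r
set.seed(1)
x <- rnorm(8366)
y <- x[order(-x)]

# Group index: 1 for the first 100 values, 2 for the next 100, and so on.
# The last group is simply shorter, so no padding is needed.
group <- (seq_along(y) - 1) %/% 100 + 1

# Apply the summary function to each group of values
group_means <- tapply(y, group, mean)
head(group_means)
```

Any other summary function can be swapped in for mean.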

Upvotes: 1

LyzandeR

Reputation: 37879

The following returns the bins as a list:

mylist <- split(y, c(rep(1:9, each = 100), rep(10, 8366 - 900)))

The first 9 elements contain 100 values each, and the remaining 7466 values are stored in the 10th element.
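
The same idea can be written without hard-coding the vector length. The helper below is a hypothetical sketch (bin_by_rank, bin_size, and n_bins are made-up names), assuming the input has more than bin_size * (n_bins - 1) values and no missing values:

```r
# Hypothetical helper: sort x in descending order, then split it so that
# the first n_bins - 1 groups hold bin_size values each and the last
# group holds whatever is left.
bin_by_rank <- function(x, bin_size = 100, n_bins = 10) {
  stopifnot(length(x) > bin_size * (n_bins - 1))
  y <- sort(x, decreasing = TRUE)
  # Group index by position, capped at n_bins so the tail lands in the last bin
  group <- pmin((seq_along(y) - 1) %/% bin_size + 1, n_bins)
  split(y, group)
}

set.seed(1)
bins <- bin_by_rank(rnorm(8366))

# Number of values in each bin
lengths(bins)
```

Capping the index with pmin is what sends every value past position 900 into the 10th bin.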

Upvotes: 2
