I'm trying to count how many elements of a random vector fall into each bin, but my code only fills in the first element of the result (as 100) and leaves the rest as NA. How would I cycle through each element of the vector x, check whether it is in the range of bin j, and return the count for each bin?
I realize there are functions to do this in R, but I'm working on hard coding this specific example.
# Sample data
set.seed(1234)
x <- rnorm(100)
S <- range(x)
a <- range(x)[1]
b <- range(x)[2]
J <- 5 #bins
h <- (b - a)/J #interval
for (j in 1:J){
  for (n in 1:length(x)){
    ifelse(x[n] > a + (j-1)*h & (x[n] <= a + j*h), n[j] <- n[j] + 1, n[j] <- n[j] + 0)
  }
}
Output:
> n
[1] 100 NA NA NA NA
Desired Output:
> n
[1] 7 43 29 13 8
Upvotes: 1
Views: 47
Reputation: 73265
Why not use cut and table?
set.seed(1234)
x <- rnorm(100)
bin <- cut(x, breaks = 5) ## evenly cut `range(x)` into 5 bins
levels(bin)
# [1] "(-2.35,-1.37]" "(-1.37,-0.388]" "(-0.388,0.591]" "(0.591,1.57]"
# [5] "(1.57,2.55]"
table(bin)
#  (-2.35,-1.37] (-1.37,-0.388] (-0.388,0.591]   (0.591,1.57]    (1.57,2.55]
#              7             43             29             13              8
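If you want the bins to sit exactly on the hand-coded boundaries a + j*h, one option is to pass explicit break points to cut. The following is only a sketch, assuming the same x and J as in the question; the names brks, bin2 and counts_cut are introduced here purely for illustration. Setting include.lowest = TRUE closes the first interval on the left, so the minimum is counted.

```r
set.seed(1234)
x <- rnorm(100)
J <- 5

## J + 1 equally spaced break points from min(x) to max(x)
brks <- seq(min(x), max(x), length.out = J + 1)

## include.lowest = TRUE makes the first bin [min, ...] rather than (min, ...]
bin2 <- cut(x, breaks = brks, include.lowest = TRUE)
counts_cut <- as.vector(table(bin2))

## every observation should land in exactly one bin
sum(counts_cut) == length(x)
```

Without include.lowest = TRUE, the minimum would fall outside the first interval and be returned as NA.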
Still, it is worth showing why your loop fails. Note that you don't need ifelse; an ordinary if (...) ... is sufficient. The error is that you use n as the loop index but also use it to record the counts, so every iteration of the inner loop overwrites it. The following corrects this by using a separate vector, counts, to keep the counts apart from the index n:
counts <- integer(J) ## initialization
for (j in 1:J) {
  for (n in 1:length(x)) {
    if (x[n] > a + (j-1)*h && x[n] <= a + j*h) counts[j] <- counts[j] + 1L
  }
}
counts
# [1] 6 43 29 13 7
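Perhaps you have noticed that the first value is 6, not 7. This is because your loop condition x[n] > a + (j-1)*h && x[n] <= a + j*h does not include the lowest value in the first bin: for j = 1 the lower bound is a = min(x), and the strict > excludes it. Since this is always the case, you need to manually add 1 to counts[1]. The last value is also 7 rather than 8; the counts sum to 98, so the maximum was probably missed as well, which can happen when a + J*h evaluates to slightly less than b in floating-point arithmetic.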
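As an alternative to patching counts[1] afterwards, the loop itself can be written so that every observation is counted. This is a minimal sketch rather than a definitive fix, assuming the same x and J as in the question. It precomputes the break points with seq (the name brks and the helper variable above_lower are introduced here for illustration), so the last break is exactly max(x), and it treats the first bin as closed on the left.

```r
set.seed(1234)
x <- rnorm(100)
J <- 5

## break points; seq() returns min(x) and max(x) exactly as the end points
brks <- seq(min(x), max(x), length.out = J + 1)

counts <- integer(J)
for (j in seq_len(J)) {
  for (i in seq_along(x)) {
    ## bins are (lower, upper]; the first bin also includes its lower bound
    above_lower <- if (j == 1L) x[i] >= brks[j] else x[i] > brks[j]
    if (above_lower && x[i] <= brks[j + 1L]) counts[j] <- counts[j] + 1L
  }
}
counts

## sanity check: no observation is dropped
sum(counts) == length(x)
```

Using i as the loop index keeps it clearly separate from the counts vector, and seq_along(x) avoids the 1:length(x) pitfall when x is empty.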
Upvotes: 2