Reputation: 1145
Let's say I have a data frame consisting of one variable (x):
df <- data.frame(x=c(1,2,3,3,5,6,7,8,9,9,4,4))
I want to know how many numbers are less than each of 2, 3, 4, 5, 6 and 7. I know how to do this manually for a single value:
# This tells you how many numbers in df are less than 4
xnew <- length(df[ which(df$x < 4), ])
My question is: how can I automate this with a for-loop or some other method? I also need to store the results in an array, as follows:
i length
2 1
3 2
4 4
5 6
6 7
7 8
Thanks
Upvotes: 1
Views: 1281
Reputation: 24480
A vectorized solution:
findInterval(2:7*(1-.Machine$double.eps),sort(df$x))
Multiplying by (1 - .Machine$double.eps) nudges each threshold just below its value, which ensures that you count only the numbers strictly lower than the threshold, not those lower than or equal to it.
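As a rough sketch of why this works: findInterval counts, for each threshold, how many elements of the sorted vector are less than or equal to it. The snippet below compares the epsilon trick with the left.open = TRUE argument, which, assuming R 3.3.0 or later, should give the strict "less than" count directly. The names thresholds, counts_eps and counts_open are made up for illustration.

```r
df <- data.frame(x = c(1, 2, 3, 3, 5, 6, 7, 8, 9, 9, 4, 4))
thresholds <- 2:7  # hypothetical name for the cut-off values

# findInterval() counts the sorted elements that are <= each threshold,
# so shrinking each threshold slightly turns "<=" into "<" for this data.
counts_eps <- findInterval(thresholds * (1 - .Machine$double.eps), sort(df$x))

# Assumption: R >= 3.3.0, where left.open = TRUE uses intervals of the form
# (a, b], i.e. it counts the elements strictly below each threshold.
counts_open <- findInterval(thresholds, sort(df$x), left.open = TRUE)

res <- cbind(i = thresholds, length = counts_eps)
res
```

Both variants should agree on this data; left.open = TRUE avoids relying on floating-point scaling.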
Upvotes: 1
Reputation: 886948
One way is to loop over the numbers 2:7 with sapply, check which elements of df$x are less than (<) each number, and take the sum of the resulting logical vector. Binding the counts to the numbers with cbind gives a matrix as output:
res <- cbind(i=2:7, length=sapply(2:7, function(y) sum(df$x <y)))
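Since the question asks about a for-loop, the same logic could be spelled out explicitly along these lines. This is only a sketch; the names i, counts and res_loop are chosen for illustration.

```r
df <- data.frame(x = c(1, 2, 3, 3, 5, 6, 7, 8, 9, 9, 4, 4))
i <- 2:7
counts <- integer(length(i))       # preallocate the result vector

for (k in seq_along(i)) {
  # TRUE counts as 1, so sum() gives the number of values below i[k]
  counts[k] <- sum(df$x < i[k])
}

res_loop <- cbind(i = i, length = counts)
res_loop
```

The sapply version does the same work in a single expression.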
Or you can vectorize by creating a matrix of the numbers 2:7, with each number replicated by the number of rows of df, and comparing it to df$x with <. The comparison is repeated for each column of the matrix, and colSums then gives the column sums:
length <- colSums(df$x <matrix(2:7, nrow=nrow(df), ncol=6, byrow=TRUE))
#or
#length <- colSums(df$x < `dim<-`(rep(2:7,each=nrow(df)),c(12,6)))
cbind(i=2:7, length=length)
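A possibly more compact way to build the same logical matrix is outer(), which is a different function from the ones used above but follows the same idea. In this sketch, cutoffs and below are hypothetical names.

```r
df <- data.frame(x = c(1, 2, 3, 3, 5, 6, 7, 8, 9, 9, 4, 4))
cutoffs <- 2:7  # hypothetical name for the numbers to compare against

# outer() builds a 12 x 6 logical matrix:
# entry [r, c] is TRUE when df$x[r] < cutoffs[c]
below <- outer(df$x, cutoffs, "<")

# Column sums count the TRUE values, i.e. the elements below each cutoff
res_outer <- cbind(i = cutoffs, length = colSums(below))
res_outer
```

This avoids writing out the matrix dimensions by hand, at the cost of the same memory use as the explicit matrix.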
Upvotes: 3
Reputation: 31161
num = c(2,3,4,5,6,7)
res = sapply(num, function(u) length(df$x[df$x < u]))
data.frame(number=num, numberBelow=res)
Upvotes: 1