sahil

Reputation: 23

R: Comparing values in vector to column in data frame

Apologies if this has been asked before, but I've searched for a while and can't find anything to answer my question. I'm somewhat comfortable using R but never really learned the fundamentals. Here's what I'm trying to do.

I've got a vector (call it "responseTimes") that looks something like this:

150  50 250  200  100  150  250  

(It's actually much longer, but I'm truncating it here.)

I've also got a data frame where one column, timeBin, is essentially counting up by 50 from 0 (so 0 50 100 150 200 250 etc).

What I'm trying to do is count how many values in responseTimes are less than or equal to the timeBin value in each row of the data frame. I want to store these counts in a new column of my data frame. My output should look something like this:

timeBin    counts
0          0
50         1
100        2
150        4
200        5
250        7

I know I can use the sum function to compare vector elements to some constant (e.g., sum(responseTimes>100) would give me 5 for the data I've shown here) but I don't know how to do this to compare to a changing value (that is, to compare to each row in the timeBin column).

I'd prefer not to use a loop, as I'm told those can be particularly slow in R and I have quite a large data set that I'm working with. Any suggestions would be much appreciated! Thanks in advance.

Upvotes: 2

Views: 2861

Answers (2)

S.C

Reputation: 740

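This might help:

responseTimes <- c(150, 50, 250, 200, 100, 150, 250)
bins1 <- seq(0, 250, by = 50)

sahil1 <- function(input = responseTimes, binsx = bins1) {
    # count of input across right-closed bins: (-Inf, 0], (0, 50], (50, 100], ...
    # the -Inf break gives the first bin (0) its own row
    tablem <- table(cut(input, c(-Inf, binsx)))
    counts <- as.vector(cumsum(tablem)) # cumulative sums = values <= each bin
    data.frame(timeBin = binsx, counts = counts) # one row per bin
}

sahil1() # counts column: 0 1 2 4 5 7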

Upvotes: 1

Jilber Urbina

Reputation: 61214

You can use sapply this way:

> timeBin <- seq(0, 250, by=50)
> responseTimes <- c(150,  50, 250,  200,  100,  150,  250 )
> 
> # using sapply (after all `sapply` is a loop)
> ans <- sapply(timeBin, function(x)  sum(responseTimes<=x))
> data.frame(timeBin, counts=ans)  # your desired output.
  timeBin counts
1       0      0
2      50      1
3     100      2
4     150      4
5     200      5
6     250      7
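
If the sapply call turns out to be slow on a large data set, one possible alternative is base R's findInterval, which does the comparison in a single vectorized call. This is only a sketch using the example data from the question; the data frame `df` is a hypothetical stand-in for the asker's real one, and it assumes the goal is a "less than or equal to" count as described.

```r
# Same example data as in the question.
timeBin <- seq(0, 250, by = 50)
responseTimes <- c(150, 50, 250, 200, 100, 150, 250)

# findInterval(x, vec) returns, for each x, the number of elements of the
# sorted vector vec that are <= x, which is the count being asked for.
# sort() also drops any NA values by default.
counts <- findInterval(timeBin, sort(responseTimes))

# Hypothetical data frame standing in for the real one.
df <- data.frame(timeBin = timeBin)
df$counts <- counts
print(df)
```

The vector is sorted once and each bin is then located by a search in the sorted values, so the response times are not rescanned for every row of timeBin.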

Upvotes: 3
