Reputation: 699
Suppose I have the following data frame dat
in R:
a <- c(6,6,6,7,9,10,11,13,16,17,19,20,22,10,23,25,32,32,34,35,6)
b <- c(1,1,0,1,0,1,0,1,1,0,0,0,1,0,1,0,0,0,0,0,1)
q <- c(3,2,4,2,4,2,2,3,0,5,0,6,1,0,3,4,9,1,2,0,12)
dat <- data.frame(a, b, q)
dat <- dat[order(dat$a),]
For each level of a
, I have to compute f = with(dat, sum(q[a==that level]*b[a==that level])/sum(b[a==that level]))
. For example, for a=6
, the f=(3*1+2*1+4*0+12*1)/(3+2+4+12)=17/21
.
So I have written the codes as following:
dat$qb <- dat$q * dat$b
v1 <- as.numeric(by(dat$qb, dat$a, FUN=sum))
dat$sqb <- rep(v1, table(dat$a))
v2<-as.numeric(by(dat$q, dat$a, FUN=sum))
dat$sumq <- rep(v2, table(dat$a))
dat$f <- dat$sqb/dat$sumq
Now I need only the columns a
and f
:
dat <- unique(dat[,c(1,7)])
My next step is to calculate the cumulative sum of the equation f = with(dat, sum(q[a==that level]*b[a==that level])/sum(b[a==that level]))
:
dat <- replace(dat, is.na(dat), 0)
dat$F <- cumsum(dat$f)
Ultimately, I need the maximum value of F
for which a<=18
:
Ft <- max(dat$F[dat$a<=18])
This is a part of a simulation and this part target is to find the value of Ft <- max(dat$F[dat$a<=18])
. I think there is faster way to write these chunk of codes.
Ft
will be appreciated.Upvotes: 2
Views: 212
Reputation: 887291
We can use tidyverse
to group by 'a', calculate 'f' by multiplying 'b' with 'q', get the sum
and divide by the sum
of 'q'. Create 'F' as the cumsum
of 'f' after replace
ing NA with 0.
library(tidyverse)
dat1 <- dat %>%
group_by(a) %>%
summarise(f = sum(b*q)/sum(q)) %>%
mutate(F = cumsum(replace(f, is.na(f), 0)))
Filter the rows where 'a' is less than or equal to 18, extract the 'F' column and get the max
dat1 %>%
filter(a <= 18) %>%
.$F %>%
max
#[1] 3.809524
Or another option is data.table
library(data.table)
setDT(dat)[, .(f= sum(b*q)/sum(q)) , a][is.na(f), f := 0
][, F := cumsum(f)][a <= 18, max(F)]
#[1] 3.809524
Upvotes: 2