mockash

Reputation: 1370

How to get cumulative sum based on a condition

I have a dataset with millions of rows and 2 columns (ID, Amount). Amount is sorted in descending order. I need to get the cumulative sum of Amount over the top share of rows, for several percentage cutoffs.

ID       Amount
101      40000
102      20000
103      15000
104      10000
......

For example, if there are 1000 rows, I need the cumulative sum of the first 1% (i.e. the first 10 rows after sorting), then 4% (40 rows), 15% (150 rows), 35% (350 rows), and below 50% (500 rows).

How do I get this in R?

Upvotes: 0

Views: 547

Answers (2)

Eric Lecoutre

Reputation: 1481

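I would begin by making sure the data frame is sorted. I assume you only want the aggregated cumulative sum at the cutoff, not the row-by-row detail:

percentage <- 0.1  # fraction of rows to include (0.1 = top 10%; use 0.01 for the top 1%)
# quantile() turns the fraction into a row index; cumsum() gives the running total at that row
cumsum(df$Amount)[round(quantile(0:nrow(df), percentage))]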
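
To get all of the cutoffs from the question in one pass, the same indexing idea can be applied to a vector of fractions. This is only a minimal sketch: it assumes a data frame named df with columns ID and Amount, and that each percentage is measured from the top row. The sample data, the names fractions, rows, and running, and the use of round() to turn a fraction into a row count are illustrative assumptions.

```r
# Hypothetical sample data: 1000 rows, mirroring the ID/Amount layout in the question
set.seed(1)
df <- data.frame(ID = 101:1100, Amount = round(runif(1000, 100, 40000)))

# Sort by Amount, largest first
df <- df[order(df$Amount, decreasing = TRUE), ]

# Fractions from the question: top 1%, 4%, 15%, 35%, 50%
fractions <- c(0.01, 0.04, 0.15, 0.35, 0.50)

# Number of rows covered by each fraction (10, 40, 150, 350, 500 for 1000 rows)
rows <- round(nrow(df) * fractions)

# Running total over the whole column, computed once, then read off at each cutoff
running <- cumsum(df$Amount)
result <- data.frame(fraction = fractions, rows = rows, cum_amount = running[rows])
print(result)
```

Computing cumsum() once and indexing it keeps the work linear in the number of rows, which matters with millions of values. A fraction small enough to round to zero rows would need separate handling.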

Upvotes: 0

rbm

Reputation: 3253

Why not subset the top fraction of the values and apply cumsum() to that?

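data <- 1:1000  # example data; in practice use the sorted Amount column
n <- length(data)
frac <- 0.01    # cumsum top 1% (renamed so it does not mask the quantile() function)
# seq_len() avoids the 1:0 pitfall when the fraction covers zero rows
cumsum(data[seq_len(floor(n * frac))])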
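
Note that cumsum() over the subset returns every running total for those rows, not a single number. If only the total for the top share is wanted, sum() over the same subset is enough. A small sketch, where top_total is a hypothetical helper name and round() is an assumed choice for converting the fraction to a row count:

```r
# Hypothetical helper: total of the largest `frac` share of the values in x
top_total <- function(x, frac) {
  x <- sort(x, decreasing = TRUE)   # make sure the largest values come first
  k <- round(length(x) * frac)      # number of values in the top share
  sum(x[seq_len(k)])                # seq_len(0) is empty, so k = 0 yields 0
}

x <- 1:1000
top_total(x, 0.01)  # sum of 991:1000, i.e. 9955
```

The result matches the last element of the cumulative vector for the same subset.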

Upvotes: 1
