Reputation: 1370
I have a dataset with millions of rows and 2 columns (ID, Amount). Amount is sorted in descending order. I need to get the cumulative sum of Amount up to certain percentage cutoffs of the rows.
ID Amount
101 40000
102 20000
103 15000
104 10000
......
For example, if there are 1000 rows, I need the cumulative sum of the first 1% (i.e. the first 10 rows after sorting), then the first 4% (40 rows), 15% (150), 35% (350) and 50% (500).
How do I get this in R?
Upvotes: 0
Views: 547
Reputation: 1481
I would begin by ensuring the data frame is sorted. I assume you only want the aggregated cumulative sum at the cutoff, not the row-by-row detail.
percentage <- 0.1 # fraction of rows to include (0.1 = first 10%)
cumsum(df$Amount)[round(quantile(0:nrow(df),percentage))]
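The same idea can probably be extended to all of the cutoffs from the question in one go. What follows is a minimal sketch, not a definitive solution: the sample data is made up, `df` is assumed to have the columns `ID` and `Amount`, and `round(nrow(df) * pct)` is swapped in for the `quantile()` call to find the cutoff rows.

```r
# Hypothetical sample data: 1000 rows, Amount not yet sorted
set.seed(1)
df <- data.frame(ID = 101:1100, Amount = sample(1:50000, 1000))

# Sort by Amount, largest first
df <- df[order(df$Amount, decreasing = TRUE), ]

# Cutoffs from the question: 1%, 4%, 15%, 35%, 50% of the rows
pct  <- c(0.01, 0.04, 0.15, 0.35, 0.50)
rows <- round(nrow(df) * pct)   # row index of each cutoff

# Cumulative sum evaluated at each cutoff row;
# as.numeric() avoids integer overflow on very large totals
totals <- cumsum(as.numeric(df$Amount))[rows]
names(totals) <- paste0(pct * 100, "%")
totals
```

This assumes every cutoff maps to at least row 1; a cutoff that rounds to 0 rows would silently drop out of the result.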
Upvotes: 0
Reputation: 3253
Why not
data <- 1:1000
n <- length(data)
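p <- 0.01 # fraction of rows to include (first 1%); avoids reusing the name of quantile()
cumsum(data[seq_len(floor(n * p))]) # running totals over the first 1% of rows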
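Note that the last line returns the running total for every row in the first 1%, not a single number. If only one total per cutoff is wanted, something like the sketch below might do. `top_share_sum` is a hypothetical helper name, not an existing R function, and the data is assumed to be sorted already.

```r
# Hypothetical helper: total of the first p fraction of x (x already sorted)
top_share_sum <- function(x, p) {
  k <- round(length(x) * p)      # number of rows in the leading share
  sum(as.numeric(head(x, k)))    # head(x, 0) is empty, so the sum is 0
}

data <- 1:1000
single <- top_share_sum(data, 0.01)   # 55 for this sample data

# One total per cutoff from the question
all_cutoffs <- sapply(c(0.01, 0.04, 0.15, 0.35, 0.50), top_share_sum, x = data)
```

Using `head()` rather than `1:k` keeps the edge case of zero rows well behaved, since `1:0` counts backwards in R.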
Upvotes: 1