Ekaterina

Reputation: 69

median in data.table R

I am trying to write code for the following task: "Write a function purchases.median.order.price that takes one argument, purchases, and returns the median order value (a number).

Grouping should be done using data.table. Records with a non-positive amount of purchased goods (returns) are ignored.

Please note that one order can correspond to several records (“positions”) with the same ordernumber, and that the order value must account for cases where the user bought several units of the same product (the number of units is given in quantity)."

sample.purchases <- data.table(price = c(100000, 6000, 7000, 5000000),
                               ordernumber = c(1,2,2,3),
                               quantity = c(1,2,1,-1),
                               product_id = 1:4)
purchases.median.order.price(sample.purchases)
# 59500

I wrote:

library(data.table)
sample.purchases <- data.table(price = c(100000, 6000, 7000, 5000000),
                               ordernumber = c(1,2,2,3),
                               quantity = c(1,2,1,-1),
                               product_id = 1:4)

sample.purchases[quantity>0][, price*quantity, by=ordernumber]

But this is wrong. How should I compute the median?

Upvotes: 0

Views: 176

Answers (1)

Oliver

Reputation: 8572

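By hand, for a numeric vector x (e.g. the per-order totals):

purchases.median.order.price <- function(x){
  # sort() returns the sorted values; order() would return only the index permutation
  x <- sort(x)
  n <- length(x) - 1
  # 1-based position of the middle; fractional when length(x) is even
  n2 <- (n/2) + 1
  # average the one or two middle elements
  sum(x[c(floor(n2), ceiling(n2))])/2
}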

Alternatively, you could write a function that just calls median or quantile.
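
For the task in the question, the median-based alternative might look roughly like the sketch below, which takes the purchases table itself. It assumes the columns are named price, quantity and ordernumber as in the sample data; order.totals and value are illustrative names, not anything the task prescribes.

```r
library(data.table)

# Sketch: median order value, with the grouping done in data.table.
purchases.median.order.price <- function(purchases) {
  # Drop returns (non-positive quantity), then total each order:
  # one order may span several rows, and each row counts price * quantity.
  order.totals <- purchases[quantity > 0,
                            .(value = sum(price * quantity)),
                            by = ordernumber]
  # Median over the per-order totals.
  median(order.totals$value)
}

sample.purchases <- data.table(price = c(100000, 6000, 7000, 5000000),
                               ordernumber = c(1, 2, 2, 3),
                               quantity = c(1, 2, 1, -1),
                               product_id = 1:4)
purchases.median.order.price(sample.purchases)
# 59500
```

The sum() is what collapses several positions into a single order total. Without it, price*quantity grouped by ordernumber returns one row per position, so order 2 would contribute 12000 and 7000 separately instead of 19000.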

Upvotes: 1
