R Using factor in a function

Question

I'm having some trouble using factors in functions, or even just using them in basic calculations. I have a data frame something like this (but with as many as 6000 different factor levels).

# stringsAsFactors = TRUE keeps ta and tt as factors (no longer the default since R 4.0)
df <- data.frame(
  p  = runif(20) * 100,
  q  = sample(1:100, 20, replace = TRUE),
  ta = c("e","e","f","f","f","i","h","e","i","i","f","f","j","j","h","h","h","e","j","i"),
  tt = c("a","a","a","b","b","b","a","a","c","c","a","b","a","a","c","c","b","a","c","b"),
  stringsAsFactors = TRUE)

Here p (price) and q (quantity) are my variables, and tt and ta are factors.

First, I would like to find the average price per unit of q for each level of the factor tt:

sum(p*q) / sum(q) by tt

In this case that would give me three values, one each for a, b and c (I have 6000 different factor levels, so I need to do this efficiently :) ).

I have tried using split to make lists, and that way I can get the prices for each tt level in one list and the quantities in another, but I can't work out how to combine them to compute, for example, an average. I've also tried tapply, but again I can't see how to incorporate the factors.

EDIT: I can see I need to clarify:

I need to find 3 values, the average price per unit of q for each factor level, so in this simplified case it would be:

a: sum of p*q for rows (1, 2, 3, 7, 8, 11, 13, 14, 18) / sum of q for rows (1, 2, 3, 7, 8, 11, 13, 14, 18)

So the result should be the average price for a, b and c, which is just 3 values.

alap · Accepted Answer

If I understood your problem correctly, this should be the answer. Give it a try and let me know how it goes, so that I can adjust it if needed.

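myRes <- function(tt) {
  out <- NULL
  for (var in tt) {
    # rows of df that belong to the current level of tt
    index <- which(df[, "tt"] == var)
    # average price per unit of q within this level: sum(p*q) / sum(q)
    out <- c(out, sum(df[index, "p"] * df[index, "q"]) / sum(df[index, "q"]))
  }
  return(out)
}

# factor() makes this work even if tt was read in as character
threeValue <- myRes(levels(factor(df[, "tt"])))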
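
Since you mentioned split and tapply, here is a rough sketch of how the same calculation could be done without an explicit loop, which may scale better to 6000 levels. The data frame toy and the function weighted_avg are made-up names for illustration only, with fixed values so the result can be checked by hand; swap in your own df.

```r
# Hypothetical toy data with fixed values (not the data from the question)
toy <- data.frame(
  p  = c(10, 20, 30, 40),
  q  = c(1, 3, 2, 2),
  tt = c("a", "a", "b", "b")
)

# Average price per unit of q for each level of tt: sum(p * q) / sum(q)
weighted_avg <- function(d) {
  tapply(d$p * d$q, d$tt, sum) / tapply(d$q, d$tt, sum)
}

res <- weighted_avg(toy)

# The same quantity via split() and weighted.mean(), as a cross-check
res_split <- sapply(split(toy, toy$tt), function(g) weighted.mean(g$p, g$q))

print(res)
print(res_split)
```

For level a this is (10*1 + 20*3) / (1 + 3) = 17.5, and for level b it is (30*2 + 40*2) / (2 + 2) = 35. Both approaches return one named value per level, so they should line up with the loop version above.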
