user3418953

Reputation: 1

In R: how can I program this sum with a for loop?

My problem: in R I start from a data frame with two variables, z and p (p are the weights), and I need this sum:

∑_i ∑_j ((z_i - z_j)·p_i·p_j·I_z)

where I_z is an indicator: I_z = -1 if z_i < z_j, and I_z = 1 otherwise. Please consider that the data are big: the data frame could have 10000 rows. I tried with a matrix but ran into a memory problem, so I think I am obliged to use for loops. Any suggestions? Thank you, Elena

Upvotes: 0

Views: 65

Answers (1)

Roland

Reputation: 132969

Your "indicator" is just a fancy way of defining the abs function.
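
To see why, note that when z_i < z_j the difference is negative and the indicator flips its sign, and otherwise the difference is already non-negative. A quick check on a few made-up values (the `indicator` helper and the sample vectors are hypothetical, only for illustration):

```r
# Indicator as defined in the question: -1 if z_i < z_j, 1 otherwise
indicator <- function(zi, zj) ifelse(zi < zj, -1, 1)

# Made-up values covering z_i < z_j, z_i == z_j and z_i > z_j
zi <- c(1, 3, 2, 5)
zj <- c(4, 3, 1, 2)

signed <- (zi - zj) * indicator(zi, zj)
identical(signed, abs(zi - zj))
# [1] TRUE
```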

You can use outer if you have sufficient RAM:

set.seed(2)
n <- 2
DF <- data.frame(z=sample(1:2, n, TRUE),
                 p=sample(1:2, n, TRUE))
#  z p
#1 1 2
#2 2 1

sum(outer(seq_len(nrow(DF)), seq_len(nrow(DF)), function(i, j) {
  abs(DF$z[i] - DF$z[j]) * DF$p[i] * DF$p[j] 
}))
#[1] 4

n <- 1e4
DF <- data.frame(z=sample(1:2, n, TRUE),
                 p=sample(1:2, n, TRUE))

sum(outer(seq_len(nrow(DF)), seq_len(nrow(DF)), function(i, j) {
  abs(DF$z[i] - DF$z[j]) * DF$p[i] * DF$p[j]   
}))
#[1] 112224330

If you don't, you need a loop. Using combn is one possibility, but it is slow since it is basically a loop:

# combn visits each unordered pair once, so the result is doubled
2 * sum(combn(seq_len(nrow(DF)), 2, function(ind) {
  abs(DF$z[ind[1]] - DF$z[ind[2]]) * DF$p[ind[1]] * DF$p[ind[2]]
}))
#[1] 112224330
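
Two further options that avoid the n x n matrix, as a rough sketch rather than a tuned implementation. The function names are hypothetical; both assume z and p are numeric vectors of equal length without missing values. The first loops over i and vectorises over j, so it needs O(n) memory but still O(n^2) time. The second sorts by z and uses cumulative sums, which should take O(n log n) time.

```r
# Option 1: loop over i, vectorise over j (O(n) memory, O(n^2) time)
weighted_abs_sum_loop <- function(z, p) {
  z <- as.numeric(z)  # avoid integer overflow in the sums
  p <- as.numeric(p)
  total <- 0
  for (i in seq_along(z)) {
    total <- total + p[i] * sum(abs(z[i] - z) * p)
  }
  total
}

# Option 2: sort by z, then use running sums.
# With z sorted ascending, the sum over pairs j < i of
# (z_i - z_j) * p_i * p_j equals sum_i p_i * (z_i * P_prev_i - S_prev_i),
# where P_prev and S_prev are the running sums of p and z * p before i.
weighted_abs_sum_sorted <- function(z, p) {
  ord <- order(z)
  z <- as.numeric(z[ord])
  p <- as.numeric(p[ord])
  P_prev <- cumsum(p) - p          # sum of p_j for j < i
  S_prev <- cumsum(z * p) - z * p  # sum of z_j * p_j for j < i
  # Each unordered pair appears twice in the double sum
  2 * sum(p * (z * P_prev - S_prev))
}
```

Both would be called as, for example, weighted_abs_sum_sorted(DF$z, DF$p). On the two-row example above (z = 1, 2 and p = 2, 1) each gives 4, matching the outer result.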

Upvotes: 2
