Rilcon42

Reputation: 9763

Get the sum of the product of two columns

Can someone explain what is wrong with my order of operations in the R functions below? I expected to multiply across each row and then sum the result, but the answer I got was very different.

qqq <- data.frame(c(1, 2, 3), c(4, 5, 6))
library(dplyr)
qqq %>% sum(.[, 1] * .[, 2]) # returns: 53

# answer I expected: 1*4+2*5+3*6 = 32

Upvotes: 3

Views: 1964

Answers (2)

thelatemail

Reputation: 93813

I think a brief example explains what is going on here, even if I don't know why it occurs in terms of the underlying code:

The entire dataset is passed to sum() as its first argument, and it is then summed together with the expression written inside the sum() call.

In the simplest case, this is equivalent to sum(data.frame(1:3)):

data.frame(1:3) %>% sum()
#[1] 6

Then 6 + 6:

data.frame(1:3) %>% sum(.[1])
#[1] 12

In the next example the input dataset sums to 12, so the result is 12 + 6:

data.frame(1:3,1:3) %>% sum(.[1])
#[1] 18

Adding both columns to the sum gives 12 + 6 + 6:

data.frame(1:3,1:3) %>% sum(.[1],.[2])
#[1] 24

And adding a multiplication gives 12 + sum(1:3 * 1:3) = 12 + 14 = 26:

data.frame(1:3,1:3) %>% sum(.[1]*.[2])
#[1] 26
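
As far as I can tell, these totals match the behaviour described in the magrittr documentation: when the dot appears only inside a nested expression, the pipe still inserts the left-hand side as the first argument. A small sketch that writes the call out by hand to check this (the names df, piped, manual and braced are only for illustration):

```r
library(magrittr)

df <- data.frame(1:3, 1:3)

# The dot appears only inside a nested expression,
# so df is still passed as the first argument of sum()
piped  <- df %>% sum(.[1] * .[2])

# The same call written out without the pipe
manual <- sum(df, df[1] * df[2])

piped == manual

# Wrapping the call in braces stops df from being inserted first,
# so only the product is summed
braced <- df %>% {sum(.[1] * .[2])}
braced
```

Here piped and manual should agree, while braced leaves out the 12 contributed by the whole dataset.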

Upvotes: 1

akrun

Reputation: 887118

We can use Reduce to get the row-wise product of the columns and then take the sum:

qqq %>%
     Reduce(`*`, .) %>%
     sum
#[1] 32

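Or, as @eipi mentioned in the comments, wrapping the expression in {} also works, because the braces stop the pipe from inserting the data frame as the first argument: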

qqq %>%
   {.[,1]*.[,2]} %>%
   sum

Or with do (from @thelatemail):

qqq %>% 
    do(.[1]*.[2]) %>%
    sum
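
For comparison, here is a minimal sketch of two ways to get the same total without relying on how the pipe places the dot. The first uses base R only; the second assumes the columns are given names (x and y are hypothetical names, not from the question) so that dplyr's summarise can refer to them directly:

```r
library(dplyr)

qqq <- data.frame(c(1, 2, 3), c(4, 5, 6))

# Base R: extract both columns as vectors, multiply element-wise, then sum
base_total <- sum(qqq[[1]] * qqq[[2]])
base_total
#[1] 32

# dplyr: same data with hypothetical column names x and y
named <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
dplyr_total <- named %>% summarise(total = sum(x * y))
dplyr_total$total
#[1] 32
```

Naming the columns avoids positional indexing with the dot altogether, which is where the original surprise came from.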

Upvotes: 4
