Shiv

Reputation: 369

Multiply each row with whole data table efficiently for large data size

I am trying to multiply two data tables in a custom way. 'dt_data' is the base data, and the whole table needs to be multiplied by each row of 'dt_matrix', with the results stacked. For a small data set a for loop is OK. How can I do this multiplication efficiently when the row count goes into the millions?

dt_data <- data.table('A' = c(1,3,2,1), 'B' = c(2,3,1,4))

dt_matrix <- data.table('A' = c(4,5), 'B' = c(3,2))


    A    B
1:  4    6
2: 12    9
3:  8    3
4:  4   12
5:  5    4
6: 15    6
7: 10    2
8:  5    8
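The loop mentioned in the question is not shown. As a point of reference, here is a minimal sketch of what such a baseline might look like; the names `res_list` and `res_loop` are hypothetical and not from the original post. It reproduces the table above, but grows slowly because it makes one pass over dt_data per row of dt_matrix.

```r
library(data.table)

dt_data   <- data.table(A = c(1, 3, 2, 1), B = c(2, 3, 1, 4))
dt_matrix <- data.table(A = c(4, 5), B = c(3, 2))

# Hypothetical baseline: multiply every column of dt_data by the matching
# value in one row of dt_matrix, then stack the per-row results.
res_list <- vector("list", nrow(dt_matrix))
for (r in seq_len(nrow(dt_matrix))) {
  res_list[[r]] <- as.data.table(Map(`*`, dt_data, dt_matrix[r]))
}
res_loop <- rbindlist(res_list)
print(res_loop)
```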

Upvotes: 1

Views: 60

Answers (2)

akrun

Reputation: 887108

One option is to apply outer to each pair of corresponding columns of the two datasets, flatten the result, and convert it to a data.table:

setDT(data.frame(Map(function(x, y) c(outer(x, y)), dt_data,dt_matrix)))[]
#    A  B
#1:  4  6
#2: 12  9
#3:  8  3
#4:  4 12
#5:  5  4
#6: 15  6
#7: 10  2
#8:  5  8
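To see why this gives the requested row order, it may help to look at a single column. The sketch below uses column A only; `prod_mat` and `flat` are illustrative names. outer builds a matrix with one column per row of dt_matrix, and c() flattens it column by column, so all products for the first dt_matrix row come first.

```r
x <- c(1, 3, 2, 1)  # column A of dt_data
y <- c(4, 5)        # column A of dt_matrix

# 4 x 2 matrix where prod_mat[i, j] equals x[i] * y[j]
prod_mat <- outer(x, y)

# c() flattens in column-major order: first everything times 4, then times 5
flat <- c(prod_mat)
print(flat)
```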

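Another option is crossing from tidyr (loaded with tidyverse). Note that crossing de-duplicates and sorts its inputs, so the row order differs from the output above, and that setnames renames by reference, so it is applied to a copy here to leave dt_matrix unchanged.

library(tidyverse)
crossing(dt_data, setnames(copy(dt_matrix), c('A1', 'B1'))) %>%
       transmute(A = A * A1, B = B * B1)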

Upvotes: 1

Sotos

Reputation: 51592

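One idea is to replicate the rows of dt_data so that they line up with the recycled rows of dt_matrix, and then multiply column by column. The row order differs from the expected output (results are grouped by dt_data row rather than by dt_matrix row) and the result is a matrix, but it should be more efficient, i.e.

mapply(`*`, dt_data[rep(seq_len(nrow(dt_data)), each = nrow(dt_matrix)),], dt_matrix)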

#      A  B
#[1,]  4  6
#[2,]  5  4
#[3,] 12  9
#[4,] 15  6
#[5,]  8  3
#[6,] 10  2
#[7,]  4 12
#[8,]  5  8
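If the order from the question matters, the same replication idea can be arranged the other way round. This is a sketch under that assumption, not part of the original answer; `lhs`, `rhs` and `res` are illustrative names. It tiles dt_data once per dt_matrix row and repeats each dt_matrix row once per dt_data row, which costs two intermediate tables of nrow(dt_data) * nrow(dt_matrix) rows.

```r
library(data.table)

dt_data   <- data.table(A = c(1, 3, 2, 1), B = c(2, 3, 1, 4))
dt_matrix <- data.table(A = c(4, 5), B = c(3, 2))

n <- nrow(dt_data)
m <- nrow(dt_matrix)

# Tile dt_data m times; repeat each dt_matrix row n times
lhs <- dt_data[rep(seq_len(n), times = m)]
rhs <- dt_matrix[rep(seq_len(m), each = n)]

# Column-wise product, returned as a data.table
res <- as.data.table(Map(`*`, lhs, rhs))
print(res)
```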

Upvotes: 0
