Reputation: 103
I need to multiply two large matrices, A [400000 x 70000] and B [70000 x 1000]. Both matrices are dense and have no special structure that I can exploit.
My current implementation divides A into chunks of rows, say sub_A [2000 x 70000], and performs sub_A * B for each chunk. I noticed that a lot of time is spent on I/O, i.e. reading in sub_A: reading in the matrix takes about 500 seconds, while the computation takes about 300 seconds.
Would PyTables help improve the I/O efficiency here? Is there any library that would help improve the running time?
Here is the code:
import numpy as np

def sim_phe_g(geno, betas, chunk_size):
    num_indv = geno.row_count
    num_snps = geno.col_count
    num_settings = betas.shape[1]
    phe_g = np.zeros([num_indv, num_settings])
    # divide individuals into chunks
    for i in range(0, num_indv, chunk_size):
        # select the rows that belong to this chunk
        sub_geno = geno[i : i + chunk_size, :]
        # load the chunk from disk into memory (the slow step)
        sub_geno = sub_geno.read().val
        phe_g[i : i + chunk_size, :] = np.dot(sub_geno, betas)
    return phe_g
geno is of size [400000 x 70000] and betas is of size [70000 x 1000]. geno is a large matrix that is stored on disk. The statement sub_geno = sub_geno.read().val loads a chunk of the genotype matrix into memory, and this statement takes a lot of time.
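Since reading (about 500 seconds) and computing (about 300 seconds) currently run one after the other, one possible direction is to read the next chunk in a background thread while the current one is being multiplied. The following is only a minimal sketch of that idea: `chunked_dot_prefetch` and `read_chunk` are hypothetical names, and `read_chunk(start, stop)` stands in for whatever call returns rows `[start, stop)` as a NumPy array. It assumes the reader releases the GIL while it waits on disk, which may or may not hold for a given file-format library.

```python
import queue
import threading

import numpy as np


def chunked_dot_prefetch(read_chunk, num_rows, betas, chunk_size):
    """Compute A @ betas chunk by chunk, overlapping disk reads with the multiply.

    read_chunk(start, stop) is a hypothetical callable that returns rows
    [start, stop) of A as a 2-D NumPy array.
    """
    out = np.zeros((num_rows, betas.shape[1]))
    # maxsize=1: at most one chunk waits in the queue while another is being
    # read and a third is being multiplied, so memory use stays bounded.
    chunks = queue.Queue(maxsize=1)
    errors = []

    def producer():
        try:
            for start in range(0, num_rows, chunk_size):
                stop = min(start + chunk_size, num_rows)
                chunks.put((start, stop, read_chunk(start, stop)))
        except Exception as exc:  # hand the error over to the main thread
            errors.append(exc)
        finally:
            chunks.put(None)  # sentinel: no more chunks

    reader = threading.Thread(target=producer, daemon=True)
    reader.start()

    while True:
        item = chunks.get()
        if item is None:
            break
        start, stop, block = item
        out[start:stop, :] = block @ betas

    reader.join()
    if errors:
        raise errors[0]
    return out
```

If the overlap works as hoped, the total time would approach the slower of the two stages rather than their sum; whether it does depends on the storage format and the reader.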
Also, I divide the big matrix into chunks because of the 32 GB memory limit.
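To make the memory constraint concrete, here is a rough size estimate for the arrays involved. It assumes 8-byte float64 values, which the post does not state; genotype data is often stored more compactly on disk, so treat the numbers as an upper-end illustration. The helper name `size_gib` is made up for this sketch.

```python
def size_gib(rows, cols, bytes_per_value=8):
    """Size of a dense rows x cols array in GiB (float64 assumed by default)."""
    return rows * cols * bytes_per_value / 2**30


full_geno = size_gib(400_000, 70_000)  # the whole matrix A
one_chunk = size_gib(2_000, 70_000)    # one sub_A chunk
betas_mem = size_gib(70_000, 1_000)    # matrix B
result_mem = size_gib(400_000, 1_000)  # the output phe_g

for name, value in [
    ("geno", full_geno),
    ("chunk", one_chunk),
    ("betas", betas_mem),
    ("phe_g", result_mem),
]:
    print(f"{name}: {value:.2f} GiB")
```

Under this assumption the full matrix is roughly 209 GiB, far beyond 32 GB, while a single 2000-row chunk is about 1 GiB, so there is some room to try larger chunks.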
Upvotes: 2
Views: 1427
Reputation: 3816
If applicable, try using TensorFlow for large matrix multiplications. As this article shows, TensorFlow performs significantly better on large matrices under many circumstances, most likely because it is built primarily for handling large matrices efficiently.
For more details on matrix multiplication specifically, refer to the documentation.
I tested it by multiplying (1000, 1000) matrices, with 100 runs each (mean ± standard deviation):
numpy.matmul: 60 ms ± 5.35 ms
tensorflow.matmul: 42.5 ms ± 2.47 ms
P.S. Only the CPU version of TensorFlow was used.
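For anyone who wants to repeat this kind of measurement, a timing harness could look roughly like the sketch below. It is not the script used for the numbers above; `time_matmul` is a hypothetical helper, and the random float64 inputs are an assumption.

```python
import timeit

import numpy as np


def time_matmul(matmul, n=1000, runs=100, seed=0):
    """Time matmul(a, b) on two random n x n matrices.

    Returns (mean, standard deviation) of the per-run wall time in milliseconds.
    """
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))
    matmul(a, b)  # warm-up run, excluded from the timings
    times = timeit.repeat(lambda: matmul(a, b), number=1, repeat=runs)
    times_ms = np.array(times) * 1e3
    return times_ms.mean(), times_ms.std()
```

Calling `time_matmul(np.matmul)` gives the NumPy figure. Passing `tf.matmul` instead would also time the conversion of the NumPy arrays to tensors on every call, so a fair comparison would probably build the tensors once beforehand.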
Upvotes: -1
Reputation: 1
Try using TensorFlow with GPU support. It is very good for matrix multiplication, as it lets you parallelize the operations.
Upvotes: -2