Miguel 2488

Reputation: 1440

How to split a huge matrix into chunks (submatrices), or handle one that gives a memory error in numpy?

I have a really large matrix that simply won't fit into memory. The matrix I have to work with has 483798149136 elements, which is roughly 484 billion floating-point numbers.

The approach I was thinking of is to split this huge matrix into submatrices that fit into memory, perform pooling operations on those submatrices, and later join them all back together to rebuild the original matrix, which will hopefully fit into memory after all the pooling operations.

Please correct me if I'm wrong; this approach is just an idea I came up with, and I don't know how good or bad it is. If you have better alternative ideas, I'm open to any suggestions.

The code to reproduce this matrix would be:

import numpy as np

a = np.arange(695556).reshape(834, 834)
# meshgrid works on the flattened inputs, so each of its outputs would have
# 695556 x 695556 = 483798149136 elements -- this is what blows up the memory
np.meshgrid(a, a)
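
To make the idea above clearer, this is roughly the split -> pool -> rejoin scheme I have in mind, just as a toy sketch on a tiny 6 x 6 matrix with 2 x 2 mean pooling (the window size and the pooling function are arbitrary, not my real data):

import numpy as np

a = np.arange(36, dtype=float).reshape(6, 6)

win = 2
# split into non-overlapping 2x2 windows -> shape (3, 3, 2, 2)
blocks = a.reshape(6 // win, win, 6 // win, win).swapaxes(1, 2)

# mean-pool every window -> a smaller (3, 3) matrix
pooled = blocks.mean(axis=(2, 3))

# the original matrix can be rebuilt from the unpooled blocks
rebuilt = blocks.swapaxes(1, 2).reshape(6, 6)
assert np.array_equal(rebuilt, a)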

I have been reading this post and this post, among others on this same site, but none of them provides a real solution to this kind of problem; they just give vague suggestions.

My questions now are:

  1. Is my splitting and pooling approach feasible, or are there better ways of doing this?

  2. How (in code terms) could I split this matrix into pieces (like windows or multidimensional kernels) and rebuild it again later?

  3. Is there some way to process the matrix in chunks in numpy, so that I can later perform operations on it such as multiplication, addition, etc.?

  4. Is there a specific Python package that helps with this kind of matrix problem?

EDIT

Since some users are asking about the goal of this whole operation, I'll provide some info:

I'm working on a 3D printing project. In the process, a laser beam melts metal powder to create complex metal pieces. These pieces are built in layers, and the laser melts the metal layer by layer.

I have 3 csv files, each one containing an 834 x 834 matrix. The first matrix contains the X-axis coordinates of the laser beam as it travels through the powder bed melting the metal, the second matrix is the same for the Y axis, and the third matrix represents the time the laser spends melting at the same pixel point. The values are expressed in seconds.

So I have the coordinates of the laser along the X and Y axes, and the time it takes to melt each point.

These matrices come from images of the sections of each manufactured piece.

The issue is that the temperature at a certain pixel, and the time the laser spends at that pixel, can influence the n-th pixel when the laser gets there. So I want to create a distance matrix that tells me how different or similar each pixel of the image is to every other pixel, in terms of Euclidean distance.

This is why, if I have for instance two 834 x 834 matrices, I need to create a 695556 x 695556 matrix with the distances between every single point in the matrix and every other one. And this is why it is so huge and will not fit into memory.

I don't know if I gave too much information, or if my explanations are messy. You can ask whatever you need and I'll try to clarify it, but the main point is that I need to create this huge distance matrix in order to know the mathematical distances between pixels, and then learn the relation between what happens at a certain point of the piece while printing it and what needs to happen at other points to avoid manufacturing defects.

Thank you very much in advance

Upvotes: 4

Views: 2236

Answers (2)

Gediz GÜRSU

Reputation: 636

Since matrix calculations work row by row or column by column, I think you can easily reduce them. Consider 200 matrices of shape 25x50 to be multiplied by 50x100 matrices: you should get 200 matrices of shape 25x100, right? First, you can split the 200 into groups of n, for instance multiply 20 at a time, and then concatenate the resulting numpy arrays. Better still, you can use 50 x m instead of 50x100, and have numpy multiply 25x50 with 50x10 for example, then concatenate again (reshape, join, reshape again, or use advanced indices).
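
A minimal numpy sketch of the first idea, splitting the batch of 200 and multiplying 20 at a time (random toy data; the chunk size of 20 is arbitrary):

import numpy as np

A = np.random.rand(200, 25, 50)   # a stack of 200 matrices, each 25x50
B = np.random.rand(50, 100)       # the 50x100 matrix to multiply by

chunks = []
for start in range(0, A.shape[0], 20):      # work on 20 matrices at a time
    chunks.append(A[start:start + 20] @ B)  # each chunk has shape (20, 25, 100)

C = np.concatenate(chunks, axis=0)          # final result: (200, 25, 100)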

Here comes the part worth mentioning: for instance, you have two 10E9 x 10E9 matrices and want to multiply them. Split them into 20x20 grids of sub-blocks, multiply the blocks, and after that sum and reshape the results. Suppose you have a GPU with little memory: I believe you can stream data to its memory in one process while it multiplies the previous chunk of data, using multiprocessing. I have never tried that, but I am going to try it with a GT 1030 or a 6500 XT using cupy or pyopencl, and I will edit this post and add it as an example.
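
As a toy sketch of the block idea (plain numpy, everything still in RAM here; in a real setting each output block would be written to disk or streamed to the GPU as soon as it is finished):

import numpy as np

def blocked_matmul(A, B, block=1000):
    # multiply A (n x k) by B (k x m) one sub-block at a time,
    # accumulating the partial products into the output block
    n, k = A.shape
    _, m = B.shape
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, block):
        for j in range(0, m, block):
            for p in range(0, k, block):
                C[i:i+block, j:j+block] += A[i:i+block, p:p+block] @ B[p:p+block, j:j+block]
    return C

A = np.random.rand(3000, 2000)
B = np.random.rand(2000, 2500)
assert np.allclose(blocked_matmul(A, B), A @ B)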

There is also a way that doesn't involve reinventing all of that: https://medium.com/analytics-vidhya/matrix-multiplication-at-scale-using-map-reduce-d5dc16710095

Upvotes: 1

Miguel 2488

Reputation: 1440

After all, I figured out a way to solve my problem. These huge matrices can be easily handled using dask. Dask is a Python library for distributed computing that splits data into chunks to optimize memory usage. It's pretty handy, since it lets you work with genuinely massive data at a really low computing and memory cost. Obviously, it's not as fast as in-memory computing, but I think many people will be glad to know about it.

This package is well optimized and frequently updated. Best of all, it uses numpy/pandas syntax; it works with dataframes the same way it works with arrays, and if you know pandas/numpy you will feel right at home with dask.

You can create a dask distributed array like this:

import dask.array as da

Y = da.random.normal(size=(695556, 695556),
                     chunks=(1000, 1000))

and then, you can perform some operations on it like this:

y = Y.mean(axis=0)[0:100].compute()

Also, if you use the memory_profiler package, you can monitor your memory usage and see how much memory your huge data is consuming during the computations.
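
For instance, a small sketch using the memory_usage helper from memory_profiler to sample memory while the dask computation above runs (same sizes as above, so it still takes a while to finish):

from memory_profiler import memory_usage
import dask.array as da

def chunked_mean():
    Y = da.random.normal(size=(695556, 695556), chunks=(1000, 1000))
    return Y.mean(axis=0)[0:100].compute()

# memory_usage samples the memory of the process (in MiB) while the function runs
samples = memory_usage((chunked_mean, (), {}), interval=1)
print("peak memory: %.1f MiB" % max(samples))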

There are some practical examples I found very illustrative here.

Also, the documentation of the array functionality in this library can be found here.

And lastly, a guide about high-performance computing in Python 3.x can be found here.

Hope this helps someone with this same issue.

Upvotes: 3
