Terry Loring

Reputation: 113

A Python LDLT factorization for sparse Hermitian matrices, is there one?

I have some large (N-by-N for N=10,000 to N=36,000,000) sparse, complex, Hermitian matrices, typically nonsingular, for which I have a spectral slicing problem. Specifically I need to know the exact number of positive eigenvalues.

I require a sparse LDLT decomposition -- is there one? Ideally, it will be a multifrontal algorithm and so well parallelized, and have the option of computing only D and not the upper triangular or permutation matrices.

I currently use ldl() in Matlab. It only works for real matrices, so I have to create a larger real matrix, and it always computes L as well as D. I need an algorithm that fits better within 64GB RAM, and I am hoping Python will be more customizable. (If so, I will learn Python.) I should add: I can get 64GB RAM per node, and can get 6 nodes. Even on a single machine with 64GB RAM, I would like to stop wasting RAM storing L only to delete it.
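For concreteness, the larger real matrix I build is the standard embedding M = [[Re H, -Im H], [Im H, Re H]], which is real symmetric and carries each eigenvalue of H twice. A sketch of that construction in SciPy terms (I have not used Python yet, so treat the exact calls as my assumption):

import scipy.sparse as sp

def real_embedding(H):
    # H = A + iB Hermitian: A is symmetric, B is antisymmetric,
    # so M = [[A, -B], [B, A]] is real symmetric and each
    # eigenvalue of H appears in M exactly twice.
    A, B = H.real, H.imag
    return sp.bmat([[A, -B], [B, A]], format='csr')

Doubling the dimension is exactly the kind of memory waste I would like to avoid.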

Perhaps someone wrote a Python front-end for MUMPS (MUltifrontal Massively Parallel Solver)?

I would have use for a non-parallel Python version of LDLT, as a lot of my research involves many matrices of moderate size.
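For that moderate-size, dense-friendly case, I gather SciPy already ships a dense Bunch-Kaufman factorization, scipy.linalg.ldl; if I read the documentation correctly, counting positive eigenvalues from the block-diagonal D (by Sylvester's law of inertia) would look roughly like this sketch (the tolerance is my own, untested, choice):

import numpy as np
from scipy.linalg import ldl

def count_positive_eigs_dense(A, tol=0.0):
    # Dense Hermitian LDL^T; D is block diagonal with 1x1 and 2x2 blocks.
    _, d, _ = ldl(A, hermitian=True)
    n = d.shape[0]
    pos, i = 0, 0
    while i < n:
        if i + 1 < n and d[i + 1, i] != 0:        # 2x2 pivot block
            pos += int(np.sum(np.linalg.eigvalsh(d[i:i + 2, i:i + 2]) > tol))
            i += 2
        else:                                     # 1x1 pivot block
            pos += int(d[i, i].real > tol)
            i += 1
    return pos                                    # = positive eigenvalues of A

Since the inertia of D equals the inertia of A, only D is needed for the count.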

Upvotes: 1

Views: 448

Answers (1)

user3666197

Reputation: 1

I need an algorithm that fits better within 64GB RAM, and I am hoping Python will be more customizable. (If so, I will learn Python.)

If this were ever possible:

>>> ( 2. * 8 ) * 10E3 ** 2 / 1E12             # a 64GB RAM can store a dense [10k,10k]
0.0016                                        #   complex128 matrix ( ~1.6 GB ), or its
                                              #   [20k,20k] real64 embedding ( ~3.2 GB )
>>> ( 2. * 8 ) * 63E3 ** 2 / 1E12             # the largest dense complex matrix that
0.063504                                      #   fits in 64GB RAM is about [63k,63k]
                                              #   but
>>> ( 2. * 8 ) * 36E6 ** 2 / 1E12             # the dense [36M,36M] one holds
20736.0                                       # ~ 21 PB of data ( results are in TB )
                                              #   Houston, we have a problem
                                              #   ( a dense [2.7M,2.7M] case has taken
                                              #     ~ a month on an HPC cluster )

The research needs are clear, yet there is no language ( be it Matlab, Python, assembler, Julia or LISP ) that can ever store 21 petabytes of data in just 64 gigabytes of physical RAM, which is what would be needed to make eigenvalue computations on a dense complex matrix of that scale possible, let alone as fast as possible. By this I also mean that "off-loading" data from in-RAM computation into any form of out-of-RAM storage is so prohibitively expensive ( roughly 1E2 to 1E5 times, i.e. 2 to 5 orders of magnitude, slower ) that any such computational process would take ages just to "read through" the 21 PB of element data once.
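To put a number on "ages", here is a back-of-the-envelope check, assuming a ( generously optimistic ) sustained read bandwidth of 10 GB/s from out-of-RAM storage -- the bandwidth figure is an assumption, not a measurement:

>>> 21E15 / 10E9                              # 21 PB streamed once at an assumed 10 GB/s
2100000.0                                     #   ~ 2.1E6 seconds
>>> 21E15 / 10E9 / 86400                      #   ~ 24 days, just to read the data once,
24.305555555555557                            #     before any arithmetic is done on it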

If your research has funding or sponsorship for a rather specific computing infrastructure, there might be some tricks to process such heaps of numbers, yet do not expect to get 21 PB of worms ( data ) into a 64 GB large ( well, rather small ) can "for free".

You may enjoy Python for many other reasons and motivations, but not for cheaper yet faster HPC-grade parallel computing, not for easily processing 21 PB of data inside a 64 GB device, and not for annihilating the principal and immense [TIME]-domain add-on costs of sparse-matrix manipulations, which only show up once the matrices are actually used in computations. Having made some xTB sparse-matrix processing feasible enough to yield results in less than 1E2 [min] instead of 2E3, I dare say I know how immensely hard it is to improve the [PSPACE]-data scaling and to shorten the often [EXPTIME] processing durations at the same time -- the truly Hell-corner of computational complexity -- where the sparse-matrix representation often creates even more headaches ( again, both in [SPACE] and, worse, in [TIME], as new types of penalties appear ) than it helps by delivering some potential [PSPACE]-savings.
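To illustrate the [SPACE] side of that warning with one concrete ( and entirely assumed ) figure, take the [36M,36M] matrix with, say, 100 non-zeros per row stored in CSR -- both the density and the index width are my assumptions, not the OP's data:

def csr_bytes(n, nnz_per_row, value_bytes=16, index_bytes=8):
    # Rough CSR footprint: values + column indices + row pointers,
    # assuming complex128 values and 64-bit indices.
    nnz = n * nnz_per_row
    return nnz * value_bytes + nnz * index_bytes + (n + 1) * index_bytes

print(csr_bytes(36_000_000, 100) / 1E9)       # ~ 86.7 GB -- already above 64 GB,
                                              #   before any fill-in from the factorisation

So even the sparse representation itself, before the factorisation adds its fill-in, can overflow a 64 GB node.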

Given the scope of the parameters, I may safely bet that even the algorithmic part will not help, and that even the promises of Quantum-Computing devices will, within our lifetime expectancy, remain unable to take such a vast parameter-space into a QC annealer-based, unstructured, quantum-driven minimiser: the translation of parameter-blocks into a ( physically size-limited ) qubit-field problem-augmentation process, as currently in use by the QC community thanks to LLNL et al research innovations, cannot be done in any reasonably short time for problems of this scale.

Sorry, no such magic seems to be anywhere near.

Using one of the available Python front-ends for MUMPS does not change this HPC side of the game, but if you wish to use one, yes, there are several of them available.

Efficient HPC-grade number-crunching at scale is still the root of the problem, governed by the product of [ processing-time ] x [ the (whatever) data-representation's efficient storage and retrieval ].

I hope you will get, and enjoy, the right mix of the comfort pythonic users are keen to stay in and the HPC-grade performance ( of whatever backend type ) you wish to have.

Upvotes: 1
