Reputation: 199
I need to multiply [2^14 x 2^14] matrices. But first, let's talk just about [2^12 x 2^12] matrices. How many memory do I need? Assuming it's double matrices, so I need 8 bytes for one element.
((2^12 * 2^12 * 8) / 2^20) * 3 = 384 MiB
It's the worst case, when I need to store all three matrices in memory.
How many memory does haskell need? Let's see.
-> let m n = matrix (2^n) (2^n) ( \(i, j) -> (fromIntegral i) * (fromIntegral j) ) :: Matrix Double
-> let p n = let r = m n in multStd2 r r ! (n,n)
-> p 12
3.299742941184e12
(4.84 secs, 1,991,404,304 bytes)
It's ~2 GiB. Why so bad and what should I do except using C++?
UPD:
I'm using standard haskell's Data.Matrix module.
https://hackage.haskell.org/package/matrix
Upvotes: 0
Views: 544
Reputation: 7088
Why so bad and what should I do except using C++.
Make sure you're using -O2
. In my tests it reduced the maximum memory
residence from 1095M to 544M. Runtime also reduces down to 2.65s from 3.48s.
Try to use a more efficient library, as also mentioned in the comments. Try
hmatrix
, for example.
Now, as for the "why". Here's how Matrix
type is implemented in matrix
library:
data Matrix a = M {
nrows :: {-# UNPACK #-} !Int -- ^ Number of rows.
, ncols :: {-# UNPACK #-} !Int -- ^ Number of columns.
, rowOffset :: {-# UNPACK #-} !Int
, colOffset :: {-# UNPACK #-} !Int
, vcols :: {-# UNPACK #-} !Int -- ^ Number of columns of the matrix without offset
, mvect :: V.Vector a -- ^ Content of the matrix as a plain vector.
}
(Assuming we're on a 64bit system)
We have 5 unpacked integers, which makes 5 x 8 bytes = 40 bytes
. For the
actual vector data we have a Vector
. It's not an unboxed
vector,
which means every element of the vector will be a pointer to a heap allocated
object. In our case, we have these functions:
m :: Int -> Matrix Double
m n =
matrix (2^n) (2^n) ( \(i, j) -> (fromIntegral i) * (fromIntegral j) )
p :: Int -> Double
p n = let r = m n in multStd2 r r ! (n,n)
(types are added to make it clear)
So we have a Matrix Double
of size 2^12 * 2^12 = 16777216
doubles.
Now to estimate how much memory it'll take to store that many doubles in the heap, let's look at GHC's heap object representation.
A heap object is represented as this:
typedef struct StgClosure_ {
StgHeader header;
struct StgClosure_ *payload[FLEXIBLE_ARRAY];
} *StgClosurePtr;
typedef struct {
const StgInfoTable* info;
} StgHeader;
typedef struct StgInfoTable_ {
StgClosureInfo layout; /* closure layout info (one word) */
StgHalfWord type; /* closure type */
StgHalfWord srt_bitmap;
StgCode code[FLEXIBLE_ARRAY];
} *StgInfoTablePtr;
typedef union {
struct { /* Heap closure payload layout: */
StgHalfWord ptrs; /* number of pointers */
StgHalfWord nptrs; /* number of non-pointers */
} payload;
StgWord bitmap; /* word-sized bit pattern describing */
/* a stack frame: see below */
OFFSET_FIELD(large_bitmap_offset); /* offset from info table to large bitmap structure */
StgWord selector_offset; /* used in THUNK_SELECTORs */
} StgClosureInfo;
We have quite a lot of bookkeeping here. To make things simpler, let's assume
the matrix
library is efficient enough to avoid thunking here(i.e. matrix
function is strict in it's return value). In this case we can say that our
payload
will only have the actual double. So here's the math:
16777216 doubles.
- 16777216 struct StgClosure_ *
for Double type = 16777216 * 8 bytes.
- Our payload will only have doubles = 16777216 * 8 bytes.
StgInfoTable*
= 16777216 * 8 bytes.StgInfoTable
consist of:
StgClosureInfo
= at least 3 words = 24 bytes.When we add up these it makes 1073M.
Now, the match was pretty hand-wavy but hopefully it's clear that for every heap object there's a lot of bookkeeping going on. To be more precise we'd need to take things like shared info tables(e.g. as far as I know same closures would have same info tables?) into account, I think.
Upvotes: 3
Reputation: 15029
Two points.
You are looking for "unboxed" arrays (or vectors or matrices or whatever), not "boxed" ones. Unboxed arrays have the same flat representation as C, while boxed arrays are arrays of pointers to heap objects that are stored outside the array. In exchange for the additional space usage, boxed arrays can store any type of element and can also store unevaluated values, making them non-strict in their contents. For a type the same size as a machine word (assuming a 64-bit system) like Double
, a boxed array will normally use three times as much space as an unboxed array.
The number "1,991,404,304 bytes" does not mean what you seem to think it means. It is quite literally the total amount of memory allocated during the computation. This does not say anything about the maximum amount of space used at any one time (except that it is less than 2GB). For peak space usage values, run with +RTS -s
and look at the "maximum residency" value in the output when the program finishes.
Upvotes: 5