Reputation: 16059
I'm working with large scientific data arrays stored in binary FITS files. For simplicity, let's say I have 2 GB of 32-bit floats saved in a file, intended to be read as a 2D array, and that I load the file as a lazy ByteString:
inp <- BL.readFile "myfile.fits"
How can I parse this file into a delayed Massiv array (Array DS Ix2 Float or Array D Ix2 Float)?
I want to avoid calling compute until the end of my operation.
I can't seem to find a simple example of loading a Massiv array from a ByteString (which seems like it should be a common use case). I can load it by evaluating the entire ByteString, using sreplicateM to parse it into a Data.Massiv.Vector and then resizeM to get it into an Ix2. However, resizeM requires me to call compute first.
Can anyone point me to an example of how to load a Lazy ByteString into a Massiv array and delay computation?
Feel free to point out that parsing to D or DS is the wrong choice. My real question is how to parse binary data from a file into an Array and allow the user to call size, slice, :>, !, etc. without evaluating the whole thing.
Upvotes: 2
Views: 74
Reputation: 50864
I think you want to use a DL array here, since, unlike DS, DL supports multidimensional arrays, and, unlike D, it gives you control over the writing process (i.e., you choose the indices as you load the matrix from the file).
A self-contained example follows:
import Control.Applicative
import qualified Data.Binary.Get as Bin
import qualified Data.ByteString.Lazy as BL
import Data.Massiv.Array

main :: IO ()
main = do
  -- for real code, use lazy I/O for the input:
  --   input <- BL.readFile "input.txt"
  -- for this example, consider a 2x2 matrix in a lazy bytestring
  -- (the bytes below are little-endian; FITS itself stores big-endian
  -- data, so real FITS input would need Bin.getFloatbe instead)
  let input :: BL.ByteString
      input = BL.pack $ [ 0,0,0x80,0x3f   -- 1
                        , 0,0,0x00,0x40   -- 2
                        , 0,0,0x40,0x40   -- 3
                        , 0,0,0x80,0x40   -- 4
                        ]
      nrow = 2
      ncol = 2

  -- use "binary" package to parse the floats into a lazy list
  let floats = Bin.runGet (many Bin.getFloathost) input

  -- create a DL array; the writer stores one element per (index, value) pair
      array :: Array DL Ix2 Float
      array = makeLoadArrayS (Sz (nrow :. ncol)) 0.0 $ \writer ->
        sequence_ $ Prelude.zipWith writer indices floats
        where indices = [(row :. col) | row <- [0..nrow-1], col <- [0..ncol-1]]

  -- compute it
  let array' :: Array P Ix2 Float
      array' = compute array
  print array'
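If the goal is to query individual elements before anything is computed, a D array whose index function decodes on demand may also be worth trying. The following is only a rough sketch under stated assumptions: delayedFloats is a hypothetical helper (not part of Massiv), the data is assumed to be row-major big-endian 32-bit floats with no header, and evaluate' is used for element access on the delayed array.

```haskell
import qualified Data.Binary.Get as Bin
import qualified Data.ByteString.Lazy as BL
import Data.Massiv.Array

-- Hypothetical helper (not a Massiv function): view a lazy ByteString of
-- row-major, big-endian 32-bit floats as a delayed matrix.
-- A value is decoded only when its index is evaluated.
delayedFloats :: Sz2 -> BL.ByteString -> Array D Ix2 Float
delayedFloats sz@(Sz2 _ ncol) input =
  makeArray Seq sz $ \(row :. col) ->
    let offset = fromIntegral (4 * (row * ncol + col))  -- byte offset of the element
    in Bin.runGet Bin.getFloatbe (BL.drop offset input)

main :: IO ()
main = do
  -- the same four values as in the example above, written big-endian
  let input = BL.pack [ 0x3f,0x80,0,0     -- 1
                      , 0x40,0x00,0,0     -- 2
                      , 0x40,0x40,0,0     -- 3
                      , 0x40,0x80,0,0     -- 4
                      ]
      arr = delayedFloats (Sz2 2 2) input
  print (size arr)                 -- the size is known without decoding anything
  print (evaluate' arr (1 :. 0))   -- decodes a single element: 3.0
```

Note that BL.drop has to walk the lazy chunks up to the requested offset, so with lazy I/O an access deep into a 2 GB file still reads everything before it. For random access on real files, a strict or memory-mapped ByteString would probably be a better backing store.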
Upvotes: 0