Sean Clark Hess

Reputation: 16059

Example of Massiv Multidimensional Array from a lazy ByteString?

I'm working with large data arrays in the sciences, encoded as binary FITS files. For simplicity, let's say I have 2GB of 32-bit floats saved in a file, intended to be read as a 2D array, and that I load the file as a lazy ByteString:

inp <- BL.readFile "myfile.fits"

How can I parse this file into a delayed Massiv array (Array DS Ix2 Float or Array D Ix2 Float)?

I want to avoid calling compute until the end of my operation.

I can't seem to find a simple example of loading a Massiv array from a ByteString (which seems like it should be a common use-case). I can load it by evaluating the entire ByteString using sreplicateM to parse to a Data.Massiv.Vector, then resizeM to get it into an Ix2. However, resizeM requires me to call compute first.

Can anyone point me to an example of how to load a Lazy ByteString into a Massiv array and delay computation?

Feel free to point out that parsing to D or DS is the wrong choice. My real question is how to parse binary data from a file into an Array and allow the user to call size, slice, :>, !, etc. without evaluating the whole thing.

Upvotes: 2

Views: 74

Answers (1)

K. A. Buhr

Reputation: 50864

I think you want to use a DL array here, since -- unlike DS -- DL supports multidimensional arrays, and unlike D, it gives you control over the writing process (i.e., you choose the indices as you load the matrix from the file).

A self-contained example follows:

import Control.Applicative
import qualified Data.Binary.Get as Bin
import qualified Data.ByteString.Lazy as BL
import Data.Massiv.Array

main :: IO ()
main = do

  -- for real code, use lazy I/O for the input:
  --   input <- BL.readFile "input.txt"

  -- for this example, consider a 2x2 matrix in a lazy bytestring
  let input :: BL.ByteString
      input = BL.pack $ [ 0,0,0x80,0x3f -- 1
                        , 0,0,0x00,0x40 -- 2
                        , 0,0,0x40,0x40 -- 3
                        , 0,0,0x80,0x40 -- 4
                        ]
      nrow = 2
      ncol = 2

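  -- use "binary" package to parse the floats into a list
  -- (the example bytes above are little-endian, hence getFloatle;
  -- FITS data is big-endian, so use Bin.getFloatbe for a real file)
  let floats = Bin.runGet (many Bin.getFloatle) input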

  -- create a DL array
      array :: Array DL Ix2 Float
      array = makeLoadArrayS (Sz (nrow :. ncol)) 0.0 $ \writer ->
        sequence_ $ Prelude.zipWith writer indices floats
        where indices = [(row :. col) | row <- [0..nrow-1], col <- [0..ncol-1]]

  -- compute it
  let array' :: Array P Ix2 Float
      array' = compute array
  print array'
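
If the goal is to query the size, index, or slice without decoding everything, one alternative worth considering is a D array whose index function decodes directly from the bytes. The following is only a rough sketch under several assumptions: the data is held in a strict ByteString (for example from Data.ByteString.readFile or a memory-mapped file) rather than a lazy one, the floats are big-endian as in FITS, the layout is row-major, and a recent massiv with makeArray, evaluate' and (!>) is in use. The helper name floatsAsDelayed is made up for illustration and is not part of massiv.

```haskell
import qualified Data.Binary.Get as Bin
import qualified Data.ByteString as BS
import qualified Data.ByteString.Lazy as BL
import Data.Massiv.Array

-- Hypothetical helper (not part of massiv): view a strict ByteString of
-- big-endian 32-bit floats as a delayed, row-major 2D array. No element is
-- decoded until it is actually requested.
floatsAsDelayed :: Sz2 -> BS.ByteString -> Array D Ix2 Float
floatsAsDelayed sz@(Sz2 _ ncol) bytes =
  makeArray Seq sz $ \(row :. col) ->
    let offset = 4 * (row * ncol + col)            -- byte offset of the element
        chunk  = BS.take 4 (BS.drop offset bytes)  -- O(1) slice, no copying
    in Bin.runGet Bin.getFloatbe (BL.fromStrict chunk)

main :: IO ()
main = do
  -- a 2x2 matrix, big-endian this time
  let bytes = BS.pack [ 0x3f,0x80,0,0    -- 1.0
                      , 0x40,0x00,0,0    -- 2.0
                      , 0x40,0x40,0,0    -- 3.0
                      , 0x40,0x80,0,0 ]  -- 4.0
      arr = floatsAsDelayed (Sz2 2 2) bytes
  print (size arr)                -- size only, no elements decoded
  print (evaluate' arr (1 :. 0))  -- decodes a single element
  print (computeAs P (arr !> 0))  -- materializes just the first row
```

The trade-off is that a D array re-decodes an element every time it is accessed, so it probably makes sense to compute once the region of interest has been narrowed down.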

Upvotes: 0
