nh2
nh2

Reputation: 25665

How to store a Haskell data type in an Unboxed Vector in continuous memory

I would like to store a non-parametric, unpacked data type like

data Point3D = Point3D {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !Int

In an Unboxed vector. Data.Vector.Unboxed says:

In particular, unboxed vectors of pairs are represented as pairs of unboxed vectors.

Why is that? I would prefer to have my Point3D laid out one after another in memory to get fast cache-local access when sequentially iterating over them - the equivalent of mystruct[1000] in C.

Using Vector.Unboxed or otherwise, how can I achieve that?


By the way: With vector-th-unbox the same happens, since with that you just transform your data type to the (Unbox a, Unbox b) => Unbox (a, b) instance.

Upvotes: 6

Views: 1009

Answers (1)

user2407038
user2407038

Reputation: 14578

I don't know why vectors of pairs are stored as pairs of vectors, but you can easily write instances for your datatype to store the elements sequentially.

{-# LANGUAGE TypeFamilies, MultiParamTypeClasses #-}

import qualified Data.Vector.Generic as G 
import qualified Data.Vector.Generic.Mutable as M 
import Control.Monad (liftM, zipWithM_)
import Data.Vector.Unboxed.Base

data Point3D = Point3D {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !Int

newtype instance MVector s Point3D = MV_Point3D (MVector s Int)
newtype instance Vector    Point3D = V_Point3D  (Vector    Int)
instance Unbox Point3D

At this point the last line will cause an error since there are no instances for vector types for Point3D. They can be written as follows:

instance M.MVector MVector Point3D where 
  basicLength (MV_Point3D v) = M.basicLength v `div` 3 
  basicUnsafeSlice a b (MV_Point3D v) = MV_Point3D $ M.basicUnsafeSlice (a*3) (b*3) v 
  basicOverlaps (MV_Point3D v0) (MV_Point3D v1) = M.basicOverlaps v0 v1 
  basicUnsafeNew n = liftM MV_Point3D (M.basicUnsafeNew (3*n))
  basicUnsafeRead (MV_Point3D v) n = do 
    [a,b,c] <- mapM (M.basicUnsafeRead v) [3*n,3*n+1,3*n+2]
    return $ Point3D a b c 
  basicUnsafeWrite (MV_Point3D v) n (Point3D a b c) = zipWithM_ (M.basicUnsafeWrite v) [3*n,3*n+1,3*n+2] [a,b,c]

instance G.Vector Vector Point3D where 
  basicUnsafeFreeze (MV_Point3D v) = liftM V_Point3D (G.basicUnsafeFreeze v)
  basicUnsafeThaw (V_Point3D v) = liftM MV_Point3D (G.basicUnsafeThaw v)
  basicLength (V_Point3D v) = G.basicLength v `div` 3
  basicUnsafeSlice a b (V_Point3D v) = V_Point3D $ G.basicUnsafeSlice (a*3) (b*3) v 
  basicUnsafeIndexM (V_Point3D v) n = do 
    [a,b,c] <- mapM (G.basicUnsafeIndexM v) [3*n,3*n+1,3*n+2]
    return $ Point3D a b c 

I think most of the function definitions are self explanatory. The vector of points is stored as a vector of Ints and the nth point is the 3n,3n+1,3n+2 Ints.

Upvotes: 9

Related Questions