dremodaris
dremodaris

Reputation: 372

NFData instance for the Coyoneda type

I have a Haskell project that uses fmap on data structures very intensively. In order to avoid traversing the same data structure again and again, while retaining the possibility to use fmap liberally, I decided to use the Coyoneda-type to guard some of the bigger structures.

The Coyoneda type has constructor Coyoneda :: (x -> y) -> f x -> Coyoneda f y. The idea is that by parametricity, more precisely by the co-Yoneda lemma, the types f a and Coyoneda f a are isomorphic, but the advantage of the Coyoneda type is that it postpones the actual structure traversal.

For example, in the following code, the first expression traverses the underlying structure 3 times, and the second one only once:

fmap k $ fmap h $ fmap g $ (x :: f a)
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ liftCoyoneda $ (x :: f a)

Indeed, the second line reduces as follows:

lowerCoyoneda $ fmap k $ fmap h $ fmap g $ liftCoyoneda $ x
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ Coyoneda id x
lowerCoyoneda $ fmap k $ fmap h $ Coyoneda g x
lowerCoyoneda $ fmap k $ Coyoneda (h . g) x
lowerCoyoneda $ Coyoneda (k . h . g) x
fmap (k . h . g) x

so it effectively postpones the actual structure traversal until you apply lowerCoyoneda.

This had a massive impact on time and a big impact on space performance, but I am still not satisfied. For this reason, I want to start using deepseq (or similar) as (indirectly) provided by the NFData typeclass. So I want to implement NFData for my functors, which now have Coyoneda-guards in their definitions.

The trouble is that reducing lambdas to normal form doesn't shrink the size of the data in their closure.

Mathematically, it would be sound to simply reduce Coyoneda g x to Coyoneda id (fmap g x). I am wondering if there is some unsafe hack - some sort of runtime rewrite rule - to implement this. Other solutions are also appreciated.

Upvotes: 7

Views: 374

Answers (2)

dfeuer
dfeuer

Reputation: 48611

Yes, you can do something like that, and yes, it's kind of ugly.

module Data.Functor.Coyoneda.NF where

import qualified Data.Functor.Coyoneda as S
import Data.IORef
import Control.DeepSeq
import System.IO.Unsafe
import Control.Exception

newtype Coyoneda f a =
  UnsafeCoyonedaFromRef {unsafeCoyonedaToRef :: IORef (S.Coyoneda f a)}

newCoyonedaIO :: S.Coyoneda f a -> IO (Coyoneda f a)
newCoyonedaIO c = UnsafeCoyonedaFromRef <$> newIORef c

newCoyoneda :: S.Coyoneda f a -> Coyoneda f a
newCoyoneda = unsafePerformIO . newCoyonedaIO

unsafeReadCoyonedaIO :: Coyoneda f a -> IO (S.Coyoneda f a)
unsafeReadCoyonedaIO (UnsafeCoyonedaFromRef ref) = readIORef ref

unsafeReadCoyoneda :: Coyoneda f a -> S.Coyoneda f a
unsafeReadCoyoneda = unsafePerformIO . unsafeReadCoyonedaIO

{-| `fmap` happens under the reference, but does NOT traverse `f`. -}
instance Functor (Coyoneda f) where
  fmap f c = unsafePerformIO $ do
    q <- unsafeReadCoyonedaIO c
    newCoyonedaIO $ fmap f q

instance (Functor f, NFData (f a)) => NFData (Coyoneda f a) where
  rnf (UnsafeCoyonedaFromRef ref) = unsafePerformIO $ do
    co <- readIORef ref
    let fx = S.lowerCoyoneda co
    -- We use evaluate because we want to be really sure the reduction to NF
    -- succeeds and we don't install bottom in the IORef.
    evaluate (rnf fx)
    writeIORef ref (S.liftCoyoneda fx)

liftCoyoneda :: f a -> Coyoneda f a
liftCoyoneda = newCoyoneda . S.liftCoyoneda

lowerCoyoneda :: Functor f => Coyoneda f a -> f a
lowerCoyoneda = S.lowerCoyoneda . unsafeReadCoyoneda

What makes unsafeReadCoyoneda "unsafe"? It subtly breaks referential transparency. If someone can extract the wrapped f a, then they may be able to do something like traverse the value, forcing the supposedly hidden elements. Or if f has a somewhat bogus fmap that changes its structure then it's possible to observe a change from supposedly pure code (ouch).

Upvotes: 1

Andr&#225;s Kov&#225;cs
Andr&#225;s Kov&#225;cs

Reputation: 30103

The trouble is that reducing lambdas to normal form doesn't reduce the thunk in their body

Functions do not contain "thunks". They contain a pointer to immutable code, possibly along with a closure of values of captured variables. Thunks might evaluate to functions, but functions themselves are always considered to be fully evaluated.

Also, since functions contain pointers to immutable machine code, function bodies cannot be updated or modified at runtime in any way. rnf is a bit of a misnomer, since GHC can't evaluate under function binders, thus can't reduce to normal forms in the usual sense in the theory of lambda calculi.

rnf does exactly the same as seq on functions, and seq itself only does anything noteworthy if you have a thunk which reduces to a function. This is not common in Haskell code; one such example would be a lazy lookup from a structure which contains functions.

rnf should be used sparingly, since it fully traverses data structures. The reason people usually use rnf is to get rid of space leaks, not to (directly) make things faster.

In summary, you could try using seq on the function part of Coyoneda, and probably should not use rnf, but I doubt that seq would help in a measurable way.

Upvotes: 1

Related Questions