Reputation: 372
I have a Haskell project that uses fmap
on data structures very intensively. In order to avoid traversing the same data structure again and again, while retaining the possibility to use fmap
liberally, I decided to use the Coyoneda
-type to guard some of the bigger structures.
The Coyoneda type has constructor Coyoneda :: (x -> y) -> f x -> Coyoneda f y
. The idea is that by parametricity, more precisely by the co-Yoneda lemma, the types f a
and Coyoneda f a
are isomorphic, but the advantage of the Coyoneda
type is that it postpones the actual structure traversal.
For example, in the following code, the first expression traverses the underlying structure 3 times, and the second one only once:
fmap k $ fmap h $ fmap g $ (x :: f a)
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ liftCoyoneda $ (x :: f a)
Indeed, the second line reduces as follows:
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ liftCoyoneda $ x
lowerCoyoneda $ fmap k $ fmap h $ fmap g $ Coyoneda id x
lowerCoyoneda $ fmap k $ fmap h $ Coyoneda g x
lowerCoyoneda $ fmap k $ Coyoneda (h . g) x
lowerCoyoneda $ Coyoneda (k . h . g) x
fmap (k . h . g) x
so it effectively postpones the actual structure traversal until you apply lowerCoyoneda
.
This had a massive impact on time and a big impact on space performance, but I am still not satisfied. For this reason, I want to start using deepseq
(or similar) as (indirectly) provided by the NFData
typeclass. So I want to implement NFData
for my functors, which now have Coyoneda
-guards in their definitions.
The trouble is that reducing lambdas to normal form doesn't shrink the size of the data in their closure.
Mathematically, it would be sound to simply reduce Coyoneda g x
to Coyoneda id (fmap g x)
. I am wondering if there is some unsafe hack - some sort of runtime rewrite rule - to implement this. Other solutions are also appreciated.
Upvotes: 7
Views: 374
Reputation: 48611
Yes, you can do something like that, and yes, it's kind of ugly.
module Data.Functor.Coyoneda.NF where
import qualified Data.Functor.Coyoneda as S
import Data.IORef
import Control.DeepSeq
import System.IO.Unsafe
import Control.Exception
newtype Coyoneda f a =
UnsafeCoyonedaFromRef {unsafeCoyonedaToRef :: IORef (S.Coyoneda f a)}
newCoyonedaIO :: S.Coyoneda f a -> IO (Coyoneda f a)
newCoyonedaIO c = UnsafeCoyonedaFromRef <$> newIORef c
newCoyoneda :: S.Coyoneda f a -> Coyoneda f a
newCoyoneda = unsafePerformIO . newCoyonedaIO
unsafeReadCoyonedaIO :: Coyoneda f a -> IO (S.Coyoneda f a)
unsafeReadCoyonedaIO (UnsafeCoyonedaFromRef ref) = readIORef ref
unsafeReadCoyoneda :: Coyoneda f a -> S.Coyoneda f a
unsafeReadCoyoneda = unsafePerformIO . unsafeReadCoyonedaIO
{-| `fmap` happens under the reference, but does NOT traverse `f`. -}
instance Functor (Coyoneda f) where
fmap f c = unsafePerformIO $ do
q <- unsafeReadCoyonedaIO c
newCoyonedaIO $ fmap f q
instance (Functor f, NFData (f a)) => NFData (Coyoneda f a) where
rnf (UnsafeCoyonedaFromRef ref) = unsafePerformIO $ do
co <- readIORef ref
let fx = S.lowerCoyoneda co
-- We use evaluate because we want to be really sure the reduction to NF
-- succeeds and we don't install bottom in the IORef.
evaluate (rnf fx)
writeIORef ref (S.liftCoyoneda fx)
liftCoyoneda :: f a -> Coyoneda f a
liftCoyoneda = newCoyoneda . S.liftCoyoneda
lowerCoyoneda :: Functor f => Coyoneda f a -> f a
lowerCoyoneda = S.lowerCoyoneda . unsafeReadCoyoneda
What makes unsafeReadCoyoneda
"unsafe"? It subtly breaks referential transparency. If someone can extract the wrapped f a
, then they may be able to do something like traverse the value, forcing the supposedly hidden elements. Or if f
has a somewhat bogus fmap
that changes its structure then it's possible to observe a change from supposedly pure code (ouch).
Upvotes: 1
Reputation: 30103
The trouble is that reducing lambdas to normal form doesn't reduce the thunk in their body
Functions do not contain "thunks". They contain a pointer to immutable code, possibly along with a closure of values of captured variables. Thunks might evaluate to functions, but functions themselves are always considered to be fully evaluated.
Also, since functions contain pointers to immutable machine code, function bodies cannot be updated or modified at runtime in any way. rnf
is a bit of a misnomer, since GHC can't evaluate under function binders, thus can't reduce to normal forms in the usual sense in the theory of lambda calculi.
rnf
does exactly the same as seq
on functions, and seq
itself only does anything noteworthy if you have a thunk which reduces to a function. This is not common in Haskell code; one such example would be a lazy lookup from a structure which contains functions.
rnf
should be used sparingly, since it fully traverses data structures. The reason people usually use rnf
is to get rid of space leaks, not to (directly) make things faster.
In summary, you could try using seq
on the function part of Coyoneda
, and probably should not use rnf
, but I doubt that seq
would help in a measurable way.
Upvotes: 1