sitiposit
sitiposit

Reputation: 149

three ways to create a circular list in haskell

I'm interested in circular / infinite lists in Haskell. I read about let..in statements and where clauses and I get the feeling that they have an important part to play, but I still don't get it.

To be concrete, I've written three versions of code for an infinite list of alternating 0's and 1's. I take it that this is what is meant by a circular list in Haskell.

cyclic = let x = 0 : y 
             y = 1 : x
         in x

cyclic' = [0,1] ++ cyclic'

cyclic'' = [0,1] ++ x
    where x = cyclic''

The second one seems simplest, shortest and most natural to me, but maybe that's because I'm still not entirely comfortable with let..in and while.

Are all these three lists represented the same way?

Upvotes: 3

Views: 3696

Answers (3)

Petr
Petr

Reputation: 63389

I'd like to mention an important distinction:

cyclic' = [0,1] ++ cyclic'

cyclic'' = [0,1] ++ x
    where x = cyclic''

These two functions are recursive in the sense that the definition of the function references itself. But

cyclic = let x = 0 : y 
             y = 1 : x
         in x

is not! It uses x internally, which is recursive, but the whole cyclic isn't - in its definition there is no reference to itself. This is also why they're different when compiled into the core language.

This has some important practical implications, namely that recursive functions can't be inlined, but non-recursive can. This is why you often see definitions like

fix :: (a -> a) -> a
fix f = let x = f x in x

(from the source of Data.Function) instead of more direct

fix f = f (fix f)

(I'm not really sure why GHC doesn't do this automatically.)

Just for completeness, there are other short ways how to define cyclic:

-- recursive:
cyclic = 0 : 1 : cyclic
-- non-recursive:
cyclic = let x = 0 : 1 : x in x
cyclic = cycle [0,1]
cyclic = fix ([0,1] ++)

Update: To give an example: Let's define

-- The `const` combinator, defined explicitly so that
-- it gets inlined.
k :: a -> b -> a
k x y = x

fix, fix' :: (a -> a) -> a
fix f     = let x = f x in x
fix' f    = f (fix' f)

main = do
    print $ fix (k 1)
    print $ fix' (k 2)

So fix' is recursive, while fix isn't (the definition of fix is copied from Data.Function). What happens when we use fix'? The compiler can't inline it, because after inlining, it would get an expression that contains fix' again. Should it inline it the second time? And then the third time? Therefore, recursive functions are never inlined by design. On the other hand, fix isn't recursive, so fix (k 1) gets inlined into let x = k 1 x in x. Then the compiler inlines k, which results in let x = 1 in x, which is replaced simply by 1.

We can verify the above claim by dumping the compiled code in the core language:

$ ghc -ddump-simpl  -dsuppress-all Main.hs 
[1 of 1] Compiling Main             ( Main.hs, Main.o )

==================== Tidy Core ====================
Result size of Tidy Core = {terms: 24, types: 27, coercions: 0}

Rec {
fix'_reJ
fix'_reJ = \ @ a_c f_aeR -> f_aeR (fix'_reJ f_aeR)
end Rec }

main
main =
  >>
    $fMonadIO
    ($ (print $fShowInteger) (__integer 1))
    ($ (print $fShowInteger)
       (fix'_reJ
          (let {
             x_aeN
             x_aeN = __integer 2 } in
           \ _ -> x_aeN)))

main
main = runMainIO main

Upvotes: 8

Sassa NF
Sassa NF

Reputation: 5406

Expanding @PetrPudlak example gives further insight:

fix f = let x = f x in x

fix' f = f (fix' f)

k :: (Int -> Int) -> Int -> Int
k f 0 = 1
k f i = f $ i-1

main = do
         print $ fix k 10
         print $ fix' k 10

Compile:

ghc -ddump-simpl -dsuppress-all c.hs

==================== Tidy Core ====================
Result size = 59

k_ra0
k_ra0 =
  \ f_aa4 ds_dru ->
    case ds_dru of wild_X6 { I# ds1_drv ->
    case ds1_drv of _ {
      __DEFAULT -> $ f_aa4 (- $fNumInt wild_X6 (I# 1));
      0 -> I# 1
    }
    }

main
main =
  >>
    $fMonadIO
    ($ (print $fShowInt)
       (letrec {
          x_ah6
          x_ah6 = k_ra0 x_ah6; } in
        x_ah6 (I# 10)))
    ($ (print $fShowInt)
       (letrec {
          fix'_ah0
          fix'_ah0 = \ f_aa3 -> f_aa3 (fix'_ah0 f_aa3); } in
        fix'_ah0 k_ra0 (I# 10)))

main
main = runMainIO main

Here it is clear that the first case, fix, gets to construct a constant once, which gets reused in recursion, but the second case, fix', has to construct a new stub on every step of recursion.

Upvotes: 1

bheklilr
bheklilr

Reputation: 54068

You can easily check this yourself by compiling with the -fext-core, which will write a corresponding .hrc file for each of your source files, which contains the intermediate "core langauge" for Haskell. In this case, if we compile this code, we get the rather difficult to read code:

%module main:Main

  %rec
  {main:Main.cycliczqzq :: ghczmprim:GHCziTypes.ZMZN
                           ghczmprim:GHCziTypes.Int =
     base:GHCziBase.zpzp @ ghczmprim:GHCziTypes.Int
     (ghczmprim:GHCziTypes.ZC @ ghczmprim:GHCziTypes.Int
      (ghczmprim:GHCziTypes.Izh (0::ghczmprim:GHCziPrim.Intzh))
      (ghczmprim:GHCziTypes.ZC @ ghczmprim:GHCziTypes.Int
       (ghczmprim:GHCziTypes.Izh (1::ghczmprim:GHCziPrim.Intzh))
       (ghczmprim:GHCziTypes.ZMZN @ ghczmprim:GHCziTypes.Int)))
     main:Main.cycliczqzq};

  %rec
  {main:Main.cycliczq :: ghczmprim:GHCziTypes.ZMZN
                         ghczmprim:GHCziTypes.Int =
     base:GHCziBase.zpzp @ ghczmprim:GHCziTypes.Int
     (ghczmprim:GHCziTypes.ZC @ ghczmprim:GHCziTypes.Int
      (ghczmprim:GHCziTypes.Izh (0::ghczmprim:GHCziPrim.Intzh))
      (ghczmprim:GHCziTypes.ZC @ ghczmprim:GHCziTypes.Int
       (ghczmprim:GHCziTypes.Izh (1::ghczmprim:GHCziPrim.Intzh))
       (ghczmprim:GHCziTypes.ZMZN @ ghczmprim:GHCziTypes.Int)))
     main:Main.cycliczq};
  arot :: ghczmprim:GHCziTypes.Int =
    ghczmprim:GHCziTypes.Izh (0::ghczmprim:GHCziPrim.Intzh);
  a1rou :: ghczmprim:GHCziTypes.Int =
    ghczmprim:GHCziTypes.Izh (1::ghczmprim:GHCziPrim.Intzh);

  %rec
  {main:Main.cyclic :: ghczmprim:GHCziTypes.ZMZN
                       ghczmprim:GHCziTypes.Int =
     ghczmprim:GHCziTypes.ZC @ ghczmprim:GHCziTypes.Int arot yrov;
   yrov :: ghczmprim:GHCziTypes.ZMZN ghczmprim:GHCziTypes.Int =
     ghczmprim:GHCziTypes.ZC @ ghczmprim:GHCziTypes.Int a1rou
     main:Main.cyclic};

If we "clean up" this a bit and remove some of the ghczmprims and whatnot, we get

{cycliczqzq :: ZMZN Int = zpzp @ Int (ZC @ Int (Izh (0 :: Intzh)) (ZC @ Int (Izh (1 :: Intzh)) (ZMZN @ Int))) cycliczqzq};

{cycliczq :: ZMZN Int = zpzp @ Int (ZC @ Int (Izh (0 :: Intzh)) (ZC @ Int (Izh (1 :: Intzh)) (ZMZN @ Int))) cycliczq};

arot :: Int = Izh (0 :: Intzh);
a1rou :: Int = Izh (1 :: Intzh);

{cyclic :: ZMZN Int = ZC @ Int arot yrov;
 yrov :: ZMZN Int = ZC @ Int a1rou cyclic};

In which we can pretty easily tell that cycliczqzq and cycliczq have the exact same definition, and we can tell that they correlate to cyclic'' and cyclic' respectively. For cyclic, we can tell that it gets defined in a different manner.

EDIT:

To add the fourth definition

cyclic4 :: [Int]
cyclic4 =
    let xx = [1, 0] ++ yy
        yy = xx
    in xx

And I also renamed them all to be cyclic1 through cyclic4 for better readability. The output of -fext-core with all the garbage removed is

{cyclic4 :: ZMZN Int = zpzp @ Int (ZC @ Int (Izh (1::Intzh)) (ZC @ Int (Izh (0::Intzh)) (ZMZN @ Int))) cyclic4};
{cyclic3 :: ZMZN Int = zpzp @ Int (ZC @ Int (Izh (0::Intzh)) (ZC @ Int (Izh (1::Intzh)) (ZMZN @ Int))) cyclic3};
{cyclic2 :: ZMZN Int = zpzp @ Int (ZC @ Int (Izh (0::Intzh)) (ZC @ Int (Izh (1::Intzh)) (ZMZN @ Int))) cyclic2};

aroR :: Int = Izh (0::Intzh);
a1roS :: Int = Izh (1::Intzh);
{cyclic1 :: ZMZN Int = ZC @ Int aroR yroT;
 yroT :: ZMZN Int = ZC @ Int a1roS cyclic1};

So we can see that the last three definitions actually get turned into the same byte code.

Also, this was all compiled without optimizations on, since that made it harder to read.

Upvotes: 3

Related Questions