And Pos

Reputation: 147

How to check that two files are equal in Haskell

I'm learning Haskell and I need to compare two files. I did not find a function that does this, so I coded it myself. Below is the function I came up with.

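{-# LANGUAGE MultiWayIf #-}
import Data.ByteString (hGet)
import Data.Function (fix)
import System.IO (IOMode (ReadMode), hIsEOF, withBinaryFile)

cmpFiles :: FilePath -> FilePath -> IO Bool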
cmpFiles a b = withBinaryFile a ReadMode $ \ha ->
               withBinaryFile b ReadMode $ \hb ->
                 fix (\loop -> do
                   isEofA <- hIsEOF ha
                   isEofB <- hIsEOF hb

                   if | isEofA && isEofB -> return True             -- both files reached EOF
                      | isEofA || isEofB -> return False            -- only one reached EOF
                      | otherwise        -> do                      -- read content
                                              x <- hGet ha 4028     -- TODO: How to use a constant?
                                              y <- hGet hb 4028     -- TODO: How to use a constant?
                                              if x /= y
                                                then return False   -- different content
                                                else loop           -- same content, continue...
                 )

My questions are:

  1. Is this code idiomatic? It looks very imperative rather than functional.
  2. Is this code efficient (lazy I/O issues with big files, performance...)?
  3. Is there a more compact way to write it?

Upvotes: 4

Views: 703

Answers (3)

Christian Brolin

Reputation: 111

cmpFiles a b = (==) <$> readFile a <*> readFile b

Upvotes: 0

Franky

Reputation: 2376

You can even make a one-liner out of it:

cmpFiles a b = liftM2 (==) (readFile a) (readFile b)

This one is actually equivalent to Reid Barton's solution. "Equivalent" is meant literally here: take the definition of liftM2 from Hackage,

liftM2 f m1 m2 = do { x1 <- m1; x2 <- m2; return (f x1 x2) }

substitute (==) for f and the two readFile calls for m1 and m2, and you arrive at his code immediately.
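Spelled out, the substitution might look like the sketch below. The name cmpFilesExpanded is hypothetical and only there so that both versions can sit in one module.

```haskell
import Control.Monad (liftM2)

-- The one-liner from above.
cmpFiles :: FilePath -> FilePath -> IO Bool
cmpFiles a b = liftM2 (==) (readFile a) (readFile b)

-- The same function after substituting f = (==), m1 = readFile a and
-- m2 = readFile b into the definition of liftM2.
-- (cmpFilesExpanded is a hypothetical name used only for this comparison.)
cmpFilesExpanded :: FilePath -> FilePath -> IO Bool
cmpFilesExpanded a b = do
    x1 <- readFile a
    x2 <- readFile b
    return (x1 == x2)
```

Apart from the variable names, the expanded version is the do-block form of the same function.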

Laziness is your friend in Haskell. The documentation of readFile states that the input is read lazily, i.e. only on demand. (==) on Strings is lazy, too: it stops at the first differing character. Thus the entire liftM2 ... expression reads the files only until it finds a difference.
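One caveat worth hedging: Prelude's readFile decodes the bytes as text in the locale's encoding, so for arbitrary binary files a lazy ByteString is probably the safer choice. A minimal sketch, assuming the bytestring package is available; cmpFilesBinary is a made-up name, not something from the question:

```haskell
import qualified Data.ByteString.Lazy as BL

-- Compare two files byte for byte.
-- BL.readFile reads the file lazily in chunks, and (==) on lazy
-- ByteStrings returns False at the first mismatch, so the rest of a
-- large file is not read once a difference has been found.
-- (cmpFilesBinary is a hypothetical name chosen for this sketch.)
cmpFilesBinary :: FilePath -> FilePath -> IO Bool
cmpFilesBinary a b = (==) <$> BL.readFile a <*> BL.readFile b
```

This keeps the lazy one-liner shape and only swaps characters for bytes; whether it beats the explicit hGet loop from the question would have to be measured.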

Upvotes: 0

Reid Barton

Reputation: 15009

How about

cmpFiles a b = do
    aContents <- readFile a
    bContents <- readFile b
    return (aContents == bContents)

Upvotes: 6
