Snowball

Reputation: 11686

Decompressing GZip in Haskell

I'm having a hard time figuring this out. Here's what I'm trying:

ghci> :m +System.FileArchive.GZip  -- From the "MissingH" package
ghci> fmap decompress $ readFile "test.html.gz"
*** Exception: test.html.gz: hGetContents: invalid argument (invalid byte sequence)

Why am I getting that exception?

I also tried Codec.Compression.GZip.decompress from the zlib package, but I can't get the types to work out to String instead of ByteString.

Upvotes: 6

Views: 1091

Answers (1)

hammar

Reputation: 139840

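You get that exception because readFile opens the file in text mode and tries to decode its bytes using your locale's character encoding. Gzip data is binary, so the decoding fails; read the file as a ByteString instead. The conversion from the decompressed ByteString to String then depends on the character encoding of the original (uncompressed) text, but assuming it's ASCII or Latin-1, this should work: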

import Codec.Compression.GZip (decompress)
import qualified Data.ByteString.Lazy as LBS
import Data.ByteString.Lazy.Char8 (unpack)

readGZipFile :: FilePath -> IO String
readGZipFile path = fmap (unpack . decompress) $ LBS.readFile path

If you need to work with some other encoding like UTF-8, replace unpack with an appropriate decoding function, e.g. Data.ByteString.Lazy.UTF8.toString.
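
To illustrate the difference, here is a minimal sketch that uses the text package instead of Data.ByteString.Lazy.UTF8 (my substitution, not the only option). The names utf8ToString, latin1ToString and sample are hypothetical helpers made up for this example, and lenient decoding (replacing invalid bytes with U+FFFD) is an assumption about what you want on bad input:

```haskell
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.Char8 as LBC
import qualified Data.Text.Lazy as TL
import Data.Text.Encoding.Error (lenientDecode)
import Data.Text.Lazy.Encoding (decodeUtf8With)

-- Hypothetical helper: decode UTF-8 bytes into a String.
-- Invalid byte sequences are replaced with U+FFFD instead of throwing.
utf8ToString :: LBS.ByteString -> String
utf8ToString = TL.unpack . decodeUtf8With lenientDecode

-- What Char8's unpack does: every byte becomes exactly one Char.
-- Correct for ASCII and Latin-1, wrong for multi-byte UTF-8.
latin1ToString :: LBS.ByteString -> String
latin1ToString = LBC.unpack

-- The word "café" encoded as UTF-8; 'é' takes two bytes (0xC3 0xA9).
sample :: LBS.ByteString
sample = LBS.pack [0x63, 0x61, 0x66, 0xC3, 0xA9]
```

Here utf8ToString sample yields the four characters of "café", while latin1ToString sample yields five, because the two bytes of 'é' are treated as separate characters. In readGZipFile above, you would use something like utf8ToString in place of unpack.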

Of course, if the file you're decompressing isn't a text file, it's better to keep it as a ByteString.

Upvotes: 9
