Reputation: 11830
I am writing a simple sitemap.xml crawler. The code is below. My question is why the code at the end of main
does not print anything. I suspect it's because of Haskell's laziness, but I don't know how to deal with it here:
    import Network.HTTP.Conduit
    import qualified Data.ByteString.Lazy as L
    import Text.XML.Light
    import Control.Monad.Trans (liftIO)
    import Control.Monad
    import Data.String.Utils
    import Control.Exception

    download :: Manager -> Request -> IO (Either HttpException L.ByteString)
    download manager req = do
        try $
            fmap responseBody (httpLbs req manager)

    downloadUrl :: Manager -> String -> IO (Either HttpException L.ByteString)
    downloadUrl manager url = do
        request <- parseUrl url
        download manager request

    getPages :: Manager -> [String] -> IO [Either HttpException L.ByteString]
    getPages manager urls =
        sequence $ map (downloadUrl manager) urls

    main = withManager $ \ manager -> do
        -- I know simpleHttp is bad here
        mapSource <- liftIO $ simpleHttp "http://example.com/sitemap.xml"
        let elements = (parseXMLDoc mapSource) >>= Just . findElements (mapElement "loc")
            Just urls = liftM (map $ (replace "/#!" "?_escaped_fragment_=") . strContent) elements
            mapElement name = QName name (Just "http://www.sitemaps.org/schemas/sitemap/0.9") Nothing
        return $
            getPages manager urls >>= \ pages -> do
                print "evaluate me!"
                sequence $ map print pages
Upvotes: 0
Views: 197
Reputation: 10783
You're running into the same problem I describe in 'Why is the type of "Main.main", "IO ()" and not "IO a"?', at least as far as having incorrect code that typechecks when it should actually give a type error. This is why you should always give main the type signature main :: IO () explicitly.
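To illustrate why the original typechecks yet prints nothing, here is a minimal sketch that needs only base. The names wrapped and bump are made up for this example and are not from the question. It shows that return applied to an IO action hands the action back as a value without running it, which is roughly what happens at the end of the question's main.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical helper: wraps an action with `return`, as the question's
-- main does. The inner action is only a value here; it is not executed.
wrapped :: IO () -> IO (IO ())
wrapped action = return action

-- With the explicit signature, writing `main = wrapped something` would be
-- rejected by the type checker instead of silently doing nothing.
main :: IO ()
main = do
  counter <- newIORef (0 :: Int)
  let bump = modifyIORef counter (+ 1)
  _ <- wrapped bump             -- the action is returned, never run
  readIORef counter >>= print   -- prints 0
  inner <- wrapped bump
  inner                         -- running the returned action explicitly
  readIORef counter >>= print   -- prints 1
```

Without the main :: IO () line, a definition such as main = wrapped (putStrLn "hi") would still compile, and the program would print nothing.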
To fix the problem, you will want to replace return with lift (see http://hackage.haskell.org/package/transformers/docs/Control-Monad-Trans-Class.html#v:lift) and replace sequence $ map ... with mapM_. mapM_ f is equivalent to sequence_ . map f.
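A rough sketch of that shape, under some loud assumptions: ReaderT () IO stands in for the ResourceT IO monad that withManager runs in (so the example needs only the transformers package), and fakeGetPages is a hypothetical replacement for getPages that does no network access. Only lift and mapM_ are the point here.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, runReaderT)

-- Hypothetical stand-in for getPages: no HTTP, it just labels each URL.
fakeGetPages :: [String] -> IO [String]
fakeGetPages urls = return (map ("page for " ++) urls)

-- ReaderT () IO plays the role of ResourceT IO (an assumption for this
-- sketch). `lift` runs the IO block inside the transformer, whereas
-- `return` would only hand the block back as an unexecuted value.
crawl :: [String] -> ReaderT () IO ()
crawl urls = lift $ do
  pages <- fakeGetPages urls
  putStrLn "evaluate me!"
  mapM_ putStrLn pages  -- same effect as: sequence_ (map putStrLn pages)

main :: IO ()
main = runReaderT (crawl ["a", "b"]) ()
```

Because mapM_ discards the results, the block ends in IO (), which also lines up with the explicit main :: IO () signature.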
Upvotes: 2
Reputation: 34391
Substitute your last return with runResourceT (http://hackage.haskell.org/package/resourcet-1.1.1/docs/Control-Monad-Trans-Resource.html#v:runResourceT). As its type suggests, it turns a ResourceT action into an IO action.
Upvotes: 2