devgeek
devgeek

Reputation: 360

Haskell API client pagination

I'm trying to build a Haskell client to consume a RESTful JSON API. I want to fetch a page, then take the "next_page" key from the response, and feed it into the query params of another api request. I want to continue doing this until I have paged through all the items. What is the cleanest way of doing this in Haskell? I'm using explicit recursion now, but I feel there must be a better way, maybe with the writer or state monad.

EDIT: Added my code. I'm aware I'm misusing fromJust.

data Post = Post {
          title  :: !Text,
          domain :: !Text,
          score  :: Int,
          url    :: !Text
} deriving (Show)

instance FromJSON Post where
  parseJSON (Object v) = do
    objectData <- v .: "data"
    title  <- objectData .: "title"
    domain <- objectData .: "domain"
    score  <- objectData .: "score"
    url    <- objectData .: "url"
    return $ Post title domain score url

data GetPostsResult = GetPostsResult {
  posts :: [Post],
  after :: Maybe Text
} deriving (Show)

instance FromJSON GetPostsResult where
  parseJSON (Object v) = do
    rootData   <- v        .:  "data"
    posts      <- rootData .:  "children"
    afterCode <- rootData .:? "after"
    return $ GetPostsResult posts afterCode

fetchPage:: Text -> IO (Maybe ([Post],Maybe Text))
fetchPage afterCode = do
  let url = "https://www.reddit.com/r/videos/top/.json?sort=top&t=day&after=" ++ unpack afterCode
  b <- get url
  let jsonBody = b ^. responseBody
  let postResponse = decode jsonBody :: Maybe GetPostsResult
  let pagePosts = posts <$>  postResponse
  let nextAfterCode = after $ fromJust postResponse
  if isNothing pagePosts then return Nothing else return (Just (fromJust pagePosts,nextAfterCode))

getPosts :: Text -> [Post] -> IO [Post]
getPosts x y = do
  p <- liftIO $ fetchPage x
  let posts = fst (fromJust p)
  let afterParam =  snd (fromJust p)
  case afterParam of
    Nothing ->  return []
    Just aff -> getPosts aff (posts ++ y)

main = do
  a <- getPosts "" []
  print a

Upvotes: 1

Views: 604

Answers (2)

Markus1189
Markus1189

Reputation: 2869

Your approach is certainly appealing due to it's simplicity, however it has the disadvantage, that the list of Posts will only be available after the whole pagination chain has ended.

I would suggest to use a streaming library like Pipe, Conduit and friends. The advantage is that you can stream the results and also limit the number of posts to retrieve, by using functions provided by the respective streaming library. Here is an example with Pipe, I added the necessary imports and added postsP:

{-# LANGUAGE OverloadedStrings #-}
import Data.Aeson
import Data.Text (Text)
import qualified Data.Text as T
import Network.Wreq
import Data.Maybe
import Control.Lens
import Control.Monad.IO.Class
import qualified Pipes as P
import Data.Foldable (for_)

data Post = Post {
          title  :: !Text,
          domain :: !Text,
          score  :: Int,
          url    :: !Text
} deriving (Show)

instance FromJSON Post where
  parseJSON (Object v) = do
    objectData <- v .: "data"
    title  <- objectData .: "title"
    domain <- objectData .: "domain"
    score  <- objectData .: "score"
    url    <- objectData .: "url"
    return $ Post title domain score url

data GetPostsResult = GetPostsResult {
  posts :: [Post],
  after :: Maybe Text
} deriving (Show)

instance FromJSON GetPostsResult where
  parseJSON (Object v) = do
    rootData   <- v        .:  "data"
    posts      <- rootData .:  "children"
    afterCode <- rootData .:? "after"
    return $ GetPostsResult posts afterCode

fetchPage:: Text -> IO (Maybe ([Post],Maybe Text))
fetchPage afterCode = do
  let url = "https://www.reddit.com/r/videos/top/.json?sort=top&t=day&after=" ++ T.unpack afterCode
  b <- get url
  let jsonBody = b ^. responseBody
  let postResponse = decode jsonBody :: Maybe GetPostsResult
  let pagePosts = posts <$>  postResponse
  let nextAfterCode = after $ fromJust postResponse
  if isNothing pagePosts then return Nothing else return (Just (fromJust pagePosts,nextAfterCode))

getPosts :: Text -> [Post] -> IO [Post]
getPosts x y = do
  p <- liftIO $ fetchPage x
  let posts = fst (fromJust p)
  let afterParam =  snd (fromJust p)
  case afterParam of
    Nothing ->  return []
    Just aff -> getPosts aff (posts ++ y)

postsP :: Text -> P.Producer Post IO ()
postsP x = do
  p <- liftIO (fetchPage x)
  for_ p $ \(posts,afterParam) -> do
    P.each posts
    for_ afterParam postsP

main = P.runEffect $ P.for (postsP "") (liftIO . print)

Upvotes: 3

Paul Johnson
Paul Johnson

Reputation: 17786

I would suggest you need a continuation monad. Within the continuation you can have the logic to assemble a page, yield it to the caller of the continuation, and then repeat. When you call your "get_pages" function you will get back the first page and a new function that takes your new parameters.

It sounds like you need this answer, which shows how to create such a monad. Make it a monad transformer if you need any other monadic state inside your function.

Upvotes: 1

Related Questions