Reputation: 2052
I'm trying to convert some working Ruby code to Clojure. The code calls a paginated REST API and accumulates the data. The Ruby code basically calls the API once, checks whether the response has a truthy pagination.hasNextPage key, and uses pagination.endCursor as a query string parameter for the subsequent API calls, which are made in a while loop. Here's the simplified Ruby code (logging, error handling, etc. removed):
def request_paginated_data(url)
  results = []
  response = # ... http get url
  response_data = response['data']
  results << response_data
  while !response_data.nil? && response.has_key?('pagination') && response['pagination'] && response['pagination'].has_key?('hasNextPage') && response['pagination']['hasNextPage'] && response['pagination'].has_key?('endCursor') && response['pagination']['endCursor']
    response = # ... http get url + response['pagination']['endCursor']
    response_data = response['data']
    results << response_data
  end
  results
end
Here's the beginnings of my Clojure code:
(defn get-paginated-data [url options]
  {:pre [(some? url) (some? options)]}
  (let [body (:body @(client/get url options))]
    (log/debug (str "body size =" (count body)))
    (let [json (json/read-str body :key-fn keyword)]
      (log/debug (str "json =" json))))
  ;; ???
  )
I know I can look for a key in the json clojure.lang.PersistentArrayMap using contains?, however, I'm not sure how to write the rest of the code...
Upvotes: 2
Views: 886
Reputation: 628
Clojure 1.11 introduced a new function, iteration, that is built exactly for this kind of pagination.
This article explains it really well: https://www.juxt.pro/blog/new-clojure-iteration
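As a rough illustration of how iteration could map onto the cursor scheme in the question, here is a minimal sketch. It needs Clojure 1.11 or later. The names fake-pages, fetch-page, next-cursor and all-data are made up for this example, and fetch-page is an in-memory stand-in for the real HTTP GET plus JSON parsing, so treat it as a starting point rather than a finished implementation:

```clojure
;; Hypothetical in-memory stand-in for the REST API: cursor -> parsed response.
;; The nil key represents the first request, which has no cursor yet.
(def fake-pages
  {nil {:data [1 2] :pagination {:hasNextPage true  :endCursor "a"}}
   "a" {:data [3 4] :pagination {:hasNextPage true  :endCursor "b"}}
   "b" {:data [5]   :pagination {:hasNextPage false :endCursor nil}}})

(defn fetch-page
  "Stand-in for the real HTTP call; takes the cursor (nil for the first page)."
  [cursor]
  (get fake-pages cursor))

(defn next-cursor
  "Returns the cursor for the next page, or nil when there is no next page."
  [page]
  (let [{:keys [hasNextPage endCursor]} (:pagination page)]
    (when (and hasNextPage (seq endCursor))
      endCursor)))

(defn all-data
  "Walks every page and concatenates the :data vectors."
  []
  (into []
        cat
        (iteration fetch-page
                   :vf :data          ; value to emit for each page
                   :kf next-cursor))) ; key (cursor) passed to the next step
```

iteration calls fetch-page with nil first, then keeps calling it with whatever next-cursor returns until that is nil. In real code, fetch-page would build the URL from the cursor and parse the response body.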
Upvotes: 1
Reputation: 2052
Here's the end result after applying recommendations from Stefan Kamphausen and Alan Thompson:
(defn get-paginated-data [^String url ^clojure.lang.PersistentArrayMap options ^clojure.lang.Keyword data-key]
  {:pre [(some? url) (some? options)]}
  (loop [results [] u url page 1]
    (log/debugf "Requesting data from API url=%s page=%d" u page)
    (let [body (:body @(client/get u options))
          body-map (json/read-str body :key-fn keyword)
          data (get-in body-map [data-key])
          has-next-page (get-in body-map [:pagination :hasNextPage])
          end-cursor (get-in body-map [:pagination :endCursor])
          accumulated-results (into results data)
          continue? (and has-next-page (> (count end-cursor) 0))]
      (log/debugf "count body=%d" (count body))
      (log/debugf "count results=%s" (count results))
      (log/debugf "has-next-page=%s" has-next-page)
      (log/debugf "end-cursor=%s" end-cursor)
      (log/debugf "continue?=%s" continue?)
      (if continue?
        (let [next-url (str url "?after=" end-cursor)]
          (log/info (str "Sleeping for " (/ pagination-delay 1000) " seconds..."))
          (Thread/sleep pagination-delay)
          (recur accumulated-results next-url (inc page)))
        accumulated-results))))
Upvotes: -1
Reputation: 1665
In the past, I've used loop and recur for such things.
Here's an example for querying the Jira API:
(defn get-jira-data [from to url hdrs]
  (loop [result []
         n 0
         api-offset 0]
    (println n " " (count result) " " api-offset)
    (let [body (jql-body from to api-offset)
          resp (client/post url
                            {:headers hdrs
                             :body body})
          issues (-> resp
                     :body
                     (json/read-str :key-fn keyword)
                     :issues)
          returned-count (count issues)
          intermediate-res (into result issues)]
      (if (and (pos? returned-count)
               (< (inc n) MAX-PAGED-PAGES))
        (recur intermediate-res
               (inc n)
               (+ api-offset returned-count))
        intermediate-res))))
I recommend limiting the loop to a maximum number of pages to avoid unforeseen and unpleasant surprises in production. With the Jira API, you send the offset or page you want for the next iteration in the body of the request. If you work with the GitHub API, for example, you'd need a local binding for the URL in the loop call instead.
Talking about the GitHub API: They ship the relevant URLs as HTTP headers in the response. You can use them like this:
(loop [result []
       u url
       n 0]
  (log/debugf "Get JSON paged (%s previous results) from %s"
              (count result) u)
  (let [resp (http-get-with-retry u {:headers auth-hdr})
        data (-> resp :body
                 (json/read-str :key-fn keyword))
        intermediate-res (into result data)
        next-url (-> resp :links :next :href)]
    (if (and next-url
             data
             (pos? (count data))
             (<= n MAX-PAGED-PAGES))
      (recur intermediate-res next-url (inc n))
      intermediate-res)))
You need to extrapolate the missing functions and other Vars here. http-get-with-retry is essentially just an HTTP GET with a retry handler function added. As you can see, the pattern is the same; it just uses the respective links from the response and a local url binding.
I hereby put all the above code under the Apache Software License 2.0 in addition to the standard license(s) on StackOverflow
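For readers wondering what a helper like http-get-with-retry might look like: the original is not shown, so the following is only a guess at the general shape, not the code from this answer. The name with-retry and its parameters are hypothetical. It wraps any zero-argument function, so the actual HTTP call is left to the caller:

```clojure
(defn with-retry
  "Calls (f) up to max-attempts times, sleeping delay-ms between failed
  attempts. Returns the first successful result; rethrows the last exception."
  [f max-attempts delay-ms]
  (loop [attempt 1]
    ;; recur is not allowed inside try/catch, so the catch clause only
    ;; returns a marker and the retry decision happens outside the try.
    (let [result (try
                   {:ok (f)}
                   (catch Exception e
                     (if (< attempt max-attempts)
                       ::retry
                       (throw e))))]
      (if (= result ::retry)
        (do (Thread/sleep (long delay-ms))
            (recur (inc attempt)))
        (:ok result)))))
```

Under these assumptions, a GET with retry could be written along the lines of (with-retry #(client/get u opts) 3 1000), where client/get is whatever HTTP client function you already use.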
Upvotes: 0
Reputation: 29984
You probably want something like:
(let [data (json/read-str body :key-fn keyword)
      hnp (get-in data [:pagination :hasNextPage])
      ec (get-in data [:pagination :endCursor])
      continue? (and hnp ec)]
  (println :hnp hnp)
  (println :ec ec)
  (println :cont continue?)
  ...)
to pull out the nested bits and print some debugging info. Double-check that the json-to-clojure conversion got the "CamelCase" keywords as expected, and modify to match if necessary.
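One way to do that double-check is a tiny helper that reports which key paths come back as nil. This is only a sketch; missing-paths and sample-page are names invented for the example, and sample-page stands in for an already parsed response:

```clojure
(defn missing-paths
  "Returns the key paths from `paths` that resolve to nil in `m`."
  [m paths]
  (vec (remove #(some? (get-in m %)) paths)))

;; Hypothetical parsed response, shaped like the one in the question.
(def sample-page
  {:data [] :pagination {:hasNextPage false :endCursor ""}})

;; Both expected paths are present, so nothing is reported as missing.
(missing-paths sample-page [[:pagination :hasNextPage]
                            [:pagination :endCursor]])

;; A kebab-case guess does not match the camelCase key and is reported.
(missing-paths sample-page [[:pagination :has-next-page]])
```

Note that an empty string is truthy in Clojure, so (and hnp ec) will not stop on an endCursor of "". If the API can return that, test the cursor with seq or clojure.string/blank? instead.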
You may find my favorite template project helpful, especially the list of documentation at the end. Be sure to read the Clojure CheatSheet!
Upvotes: 1