Reputation: 950
The GitHub API sends the pagination data for JSON results in the HTTP Link header:
Link: <https://api.github.com/repos?page=3&per_page=100>; rel="next",
<https://api.github.com/repos?page=50&per_page=100>; rel="last"
Since the GitHub API is probably not the only API using this method, I wanted to ask whether someone has a useful little snippet that parses the Link header (and converts it to an array, for example) so that I can use it in my JS app.
I googled around but found nothing useful on how to parse pagination from JSON APIs.
Upvotes: 46
Views: 26457
Reputation: 471
Instead of using the original parse-link-header package, another option is @web3-storage/parse-link-header, a fork of the original NPM package. The API is the same, but the fork comes with some advantages.
Installation:
npm install @web3-storage/parse-link-header
Usage:
import { parseLinkHeader } from '@web3-storage/parse-link-header'
const linkHeader =
'<https://api.github.com/user/9287/repos?page=3&per_page=100>; rel="next", ' +
'<https://api.github.com/user/9287/repos?page=1&per_page=100>; rel="prev"; pet="cat", ' +
'<https://api.github.com/user/9287/repos?page=5&per_page=100>; rel="last"'
const parsed = parseLinkHeader(linkHeader)
console.log(parsed)
Output:
{
"next":{
"page":"3",
"per_page":"100",
"rel":"next",
"url":"https://api.github.com/user/9287/repos?page=3&per_page=100"
},
"prev":{
"page":"1",
"per_page":"100",
"rel":"prev",
"pet":"cat",
"url":"https://api.github.com/user/9287/repos?page=1&per_page=100"
},
"last":{
"page":"5",
"per_page":"100",
"rel":"last",
"url":"https://api.github.com/user/9287/repos?page=5&per_page=100"
}
}
Upvotes: 3
Reputation: 151
This is a Java function that serves the purpose: it finds the link for a given parameter key and parameter value. Please note: I wrote this for personal use, so it might not be foolproof for your scenario; review it and make changes accordingly.
https://github.com/akshaysom/LinkExtract/blob/main/LinkExtract.java
// Returns the URL of the first link whose parameter `param` equals `value`, or null.
// Note: `value` is compared verbatim, so for rel="next" pass "\"next\"" (quotes included).
public static String getLinkFromLinkHeaderByParamAndValue(String header, String param, String value) {
    if (header != null && param != null && value != null && !"".equals(header.trim()) && !"".equals(param.trim())
            && !"".equals(value)) {
        // Naive split: assumes the URLs themselves contain no commas.
        String[] links = header.split(",");
        for (String link : links) {
            String[] segments = link.split(";");
            String segmentLink = "";
            for (String segment : segments) {
                segment = segment.trim();
                if (segment.startsWith("<") && segment.endsWith(">")) {
                    // The <...> segment carries the URL itself.
                    segmentLink = segment.substring(1, segment.length() - 1);
                } else {
                    // Every other segment is a key=value parameter such as rel="next".
                    String[] keyValue = segment.split("=");
                    if (keyValue.length > 1) {
                        String currentSegmentParam = keyValue[0].trim();
                        String currentSegmentValue = keyValue[1].trim();
                        if (param.equals(currentSegmentParam) && value.equals(currentSegmentValue)) {
                            return segmentLink;
                        }
                    }
                }
            }
        }
    }
    return null;
}
Upvotes: 1
Reputation: 11716
For anyone who ended up here searching for a Link header parser in Java: you can use javax.ws.rs.core.Link. See the example below:
import javax.ws.rs.core.Link;
String linkHeaderValue = "<https://api.github.com/repos?page=3&per_page=100>; rel='next'";
Link link = Link.valueOf(linkHeaderValue);
Upvotes: 6
Reputation: 21
Here is some simple code to parse the Link header from GitHub in JavaScript:
var parse = require('parse-link-header');
var parsed = parse(res.headers.link);
var no_of_pages = parsed.last.page;
Upvotes: 1
Reputation: 1454
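Here is a Python solution to get the contributor count for any GitHub repo.
import requests
from urllib.parse import urlparse, parse_qs
rsp = requests.head('https://api.github.com/repos/fabric8-analytics/fabric8-analytics-server/contributors?per_page=1')
# With per_page=1, the page number of the "last" link equals the number of contributors.
# Parse only the query string, so the lookup works wherever "page" appears in the URL.
last_url = rsp.links['last']['url']
contributors_count = parse_qs(urlparse(last_url).query)['page'][0]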
Upvotes: 0
Reputation: 155
Here is a simple JavaScript function that extracts the useful info from the Link header into a nice object notation.
var linkParser = (linkHeader) => {
  let re = /<([^\?]+\?[a-z]+=([\d]+))>;[\s]*rel="([a-z]+)"/g;
  let arrRes = [];
  let obj = {};
  while ((arrRes = re.exec(linkHeader)) !== null) {
    obj[arrRes[3]] = {
      url: arrRes[1],
      page: arrRes[2]
    };
  }
  return obj;
}
It outputs a result like this:
{
"next": {
"url": "https://api.github.com/user/9919/repos?page=2",
"page": "2"
},
"last": {
"url": "https://api.github.com/user/9919/repos?page=10",
"page": "10"
}
}
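Note that the regular expression above expects exactly one numeric query parameter per URL, so a link such as `?page=3&per_page=100` would not match. A minimal sketch of a more tolerant variant follows. It assumes the simple `<url>; rel="name"` shape that GitHub sends and that parameter values contain no `<`. The name `parseLinkHeader` is made up for this example, and this is not a full RFC 8288 parser.

```javascript
// Hypothetical helper (not from any library): turns a Link header into an
// object keyed by rel, copying every query parameter onto each entry.
function parseLinkHeader(header) {
  const links = {};
  if (!header) return links;
  // Group 1: the URL between <...>; group 2: everything up to the next "<".
  const re = /<([^>]*)>([^<]*)/g;
  let match;
  while ((match = re.exec(header)) !== null) {
    const url = match[1].trim();
    // Look for a rel parameter (quoted or unquoted) among the link's parameters.
    const relMatch = /;\s*rel\s*=\s*"?([^";,]+)"?/i.exec(match[2]);
    if (!relMatch) continue;
    // Extract the query string by hand so relative URLs work as well.
    const queryStart = url.indexOf('?');
    const query = queryStart === -1 ? '' : url.slice(queryStart + 1).split('#')[0];
    const params = {};
    for (const [key, value] of new URLSearchParams(query)) {
      params[key] = value;
    }
    // A rel attribute may hold several space-separated relation types.
    for (const rel of relMatch[1].trim().split(/\s+/)) {
      links[rel] = { ...params, rel, url };
    }
  }
  return links;
}
```

With the header from the question, `parseLinkHeader(header).next.page` would be the string `'3'` and `parseLinkHeader(header).last.url` the full URL of the last page.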
Upvotes: 5
Reputation: 1463
Here's a simple bash script with curl and sed to get all pages from a long query
url="https://api.github.com/repos/$GIT_USER/$GIT_REPO/issues"
while [ "$url" ]; do
echo "$url" >&2
curl -Ss -n "$url"
url="$(curl -Ss -I -n "$url" | sed -n -E 's/^[Ll]ink:.*<([^>]*)>; rel="next".*/\1/p')"
done > issues.json
Upvotes: 3
Reputation: 12587
I completely understand this is "technically" a JavaScript thread. But if you're like me and arrived here by googling "how to parse Link header", I thought I'd share my solution for my environment (C#).
using System;
using System.Linq;
using System.Text.RegularExpressions;
public class LinkHeader
{
    public string FirstLink { get; set; }
    public string PrevLink { get; set; }
    public string NextLink { get; set; }
    public string LastLink { get; set; }
    // The parameter is named headerValue so it does not clash with the local linkHeader.
    public static LinkHeader FromHeader(string headerValue)
    {
        LinkHeader linkHeader = null;
        if (!string.IsNullOrWhiteSpace(headerValue))
        {
            // Splitting on "\"," removes the closing quote of every rel value except the last.
            string[] linkStrings = headerValue.Split(new[] { "\"," }, StringSplitOptions.None);
            if (linkStrings != null && linkStrings.Any())
            {
                linkHeader = new LinkHeader();
                foreach (string linkString in linkStrings)
                {
                    // Read the rel value up to the closing quote or the end of the piece.
                    var relMatch = Regex.Match(linkString, "(?<=rel=\")[^\"]+", RegexOptions.IgnoreCase);
                    var linkMatch = Regex.Match(linkString, "(?<=<).+?(?=>)", RegexOptions.IgnoreCase);
                    if (relMatch.Success && linkMatch.Success)
                    {
                        string rel = relMatch.Value.ToUpper();
                        string link = linkMatch.Value.Trim();
                        switch (rel)
                        {
                            case "FIRST":
                                linkHeader.FirstLink = link;
                                break;
                            case "PREV":
                                linkHeader.PrevLink = link;
                                break;
                            case "NEXT":
                                linkHeader.NextLink = link;
                                break;
                            case "LAST":
                                linkHeader.LastLink = link;
                                break;
                        }
                    }
                }
            }
        }
        return linkHeader;
    }
}
Testing in a console app, using GitHub's example Link header:
void Main()
{
string link = "<https://api.github.com/user/repos?page=3&per_page=100>; rel=\"next\", <https://api.github.com/user/repos?page=50&per_page=100>; rel=\"last\"";
LinkHeader linkHeader = LinkHeader.FromHeader(link);
}
Upvotes: 6
Reputation: 6644
If you can use Python and don't want to implement the full specification, but need something that works for the GitHub API, then here we go:
import re
header_link = '<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"'
if re.search(r'; rel="next"', header_link):
    print(re.sub(r'.*<(.*)>; rel="next".*', r'\1', header_link))
Upvotes: 3
Reputation: 2080
The parse-link-header NPM module exists for this purpose; its source can be found on GitHub under an MIT license (free for commercial use).
Installation is as simple as:
npm install parse-link-header
Usage looks like the following:
var parse = require('parse-link-header');
var parsed = parse('<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"')
...after which one has parsed.next, parsed.last, etc.:
{ next:
{ page: '3',
per_page: '100',
rel: 'next',
url: 'https://api.github.com/repos?page=3&per_page=100' },
last:
{ page: '50',
per_page: '100',
rel: 'last',
url: 'https://api.github.com/repos?page=50&per_page=100' } }
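Once the header is parsed, walking all pages comes down to following `next` until it is no longer present. The following is only a sketch under stated assumptions: `fetchAllPages` and `fetchPage` are hypothetical names, not part of parse-link-header. `fetchPage` stands for whatever HTTP call your app makes and is assumed to resolve to `{ items, link }`, where `link` is the raw Link header value (or null). `parse` is the parser function, for example the one exported by parse-link-header.

```javascript
// Hypothetical helper: collects the items of every page by following rel="next".
// fetchPage(url) -> Promise<{ items: Array, link: string | null }>  (assumed shape)
// parse(header)  -> object with an optional `next.url`, as parse-link-header returns
async function fetchAllPages(firstUrl, fetchPage, parse) {
  const all = [];
  let url = firstUrl;
  while (url) {
    const { items, link } = await fetchPage(url);
    all.push(...items);
    // Stop when there is no Link header or it carries no "next" relation.
    const parsed = link ? parse(link) : null;
    url = parsed && parsed.next ? parsed.next.url : null;
  }
  return all;
}
```

Passing the parser in as an argument keeps the loop independent of any particular library, so it can be swapped for a hand-written parser if needed.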
Upvotes: 28
Reputation: 907
I found this Gist: Parse GitHub Links header in JavaScript
Tested it out on the Github API and it returns an object like:
var results = {
last: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=4",
next: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=2"
};
Upvotes: 10
Reputation: 313
I found wombleton/link-headers on GitHub. It appears to be made for the browser rather than as an npm module, but it looks like it wouldn't be hard to modify for a server-side environment. It uses pegjs to generate a real RFC 5988 parser instead of relying on string splits, so it should work for any Link header, not just GitHub's.
Upvotes: 7
Reputation: 3030
There is a PageLinks class in the GitHub Java API that shows how to parse the Link header.
Upvotes: 18