toxinlabs

Reputation: 950

How to parse link header from github API

The GitHub API sends the pagination data for JSON results in the HTTP Link header:

Link: <https://api.github.com/repos?page=3&per_page=100>; rel="next",
<https://api.github.com/repos?page=50&per_page=100>; rel="last"

Since the GitHub API is not the only API using this method (I think), I wanted to ask whether someone has a useful little snippet to parse the Link header (and convert it to an array, for example) so that I can use it in my JS app.

I googled around but found nothing useful on how to parse pagination from JSON APIs.

Upvotes: 46

Views: 26457

Answers (13)

Ray Jasson

Reputation: 471

Instead of using the original parse-link-header package, another option is @web3-storage/parse-link-header, a fork of the original npm package. The API is the same, but it comes with advantages like:

  • TypeScript support
  • Zero dependencies
  • No Node.js globals and ESM

Installation:

npm install @web3-storage/parse-link-header

Usage:

import { parseLinkHeader } from '@web3-storage/parse-link-header'

const linkHeader =
  '<https://api.github.com/user/9287/repos?page=3&per_page=100>; rel="next", ' +
  '<https://api.github.com/user/9287/repos?page=1&per_page=100>; rel="prev"; pet="cat", ' +
  '<https://api.github.com/user/9287/repos?page=5&per_page=100>; rel="last"'

const parsed = parseLinkHeader(linkHeader)
console.log(parsed)

Output:

{
   "next":{
      "page":"3",
      "per_page":"100",
      "rel":"next",
      "url":"https://api.github.com/user/9287/repos?page=3&per_page=100"
   },
   "prev":{
      "page":"1",
      "per_page":"100",
      "rel":"prev",
      "pet":"cat",
      "url":"https://api.github.com/user/9287/repos?page=1&per_page=100"
   },
   "last":{
      "page":"5",
      "per_page":"100",
      "rel":"last",
      "url":"https://api.github.com/user/9287/repos?page=5&per_page=100"
   }
}

Upvotes: 3

Akshay Som

Reputation: 151

This is a Java function that serves the purpose: it returns the link for the given parameter key and parameter value. Please note: I wrote this for personal use, so it might not be foolproof for your scenario; review it and adapt it accordingly. The value is compared exactly as it appears in the header, so for rel="next" pass the value with its surrounding double quotes.

https://github.com/akshaysom/LinkExtract/blob/main/LinkExtract.java

  public static String getLinkFromLinkHeaderByParamAndValue(String header, String param, String value) {
            if (header != null && param != null && value != null && !"".equals(header.trim()) && !"".equals(param.trim())
                    && !"".equals(value)) {
    
                String[] links = header.split(",");
    
                LINKS_LOOP: for (String link : links) {
    
                    String[] segments = link.split(";");
    
                    if (segments != null) {
    
                        String segmentLink = "";
    
                        SEGMENT_LOOP: for (String segment : segments) {
                            segment = segment.trim();
                            if (segment.startsWith("<") && segment.endsWith(">")) {
    
                                segmentLink = segment.substring(1, segment.length() - 1);
                                continue SEGMENT_LOOP;
    
                            } else {
                                if (segment.split("=").length > 1) {
    
                                    String currentSegmentParam = segment.split("=")[0].trim();
                                    String currentSegmentValue = segment.split("=")[1].trim();
    
                                    if (param.equals(currentSegmentParam) && value.equals(currentSegmentValue)) {
                                        return segmentLink;
                                    }
                                }
                            }
                        }
                    }
                }
            }
            return null;
        }

Upvotes: 1

Sahil Chhabra

Reputation: 11716

For anyone who ended up here searching for a Link header parser in Java, you can use javax.ws.rs.core.Link. See the example below:

import javax.ws.rs.core.Link;

String linkHeaderValue = "<https://api.github.com/repos?page=3&per_page=100>; rel='next'";
Link link = Link.valueOf(linkHeaderValue);

Upvotes: 6

Anvesh Reddy

Reputation: 21

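Here is some simple code to parse the Link header from GitHub in JavaScript (res is assumed to be the HTTP response object):

var parse = require('parse-link-header');
var parsed = parse(res.headers.link);
// parsed.last is only present when the response includes a rel="last" link
var no_of_pages = parsed.last.page;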

Upvotes: 1

Arunprasad Rajkumar

Reputation: 1454

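Here is a Python solution to get the contributor count for any GitHub repo.

import requests
from urllib.parse import parse_qs, urlparse

rsp = requests.head('https://api.github.com/repos/fabric8-analytics/fabric8-analytics-server/contributors?per_page=1')
# rsp.links is the Link header as parsed by requests, keyed by rel
last_url = rsp.links['last']['url']
# Parse only the query string, not the whole URL
contributors_count = parse_qs(urlparse(last_url).query)['page'][0]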

Upvotes: 0

Harman

Reputation: 155

Here is a simple JavaScript function that extracts the useful info from the Link header into a plain object.

var linkParser = (linkHeader) => {
  let re = /<([^\?]+\?[a-z]+=([\d]+))>;[\s]*rel="([a-z]+)"/g;
  let arrRes = [];
  let obj = {};
  while ((arrRes = re.exec(linkHeader)) !== null) {
    obj[arrRes[3]] = {
      url: arrRes[1],
      page: arrRes[2]
    };
  }
  return obj;
}

It outputs a result like this:

{
  "next": {
    "url": "https://api.github.com/user/9919/repos?page=2",
    "page": "2"
  },
  "last": {
    "url": "https://api.github.com/user/9919/repos?page=10",
    "page": "10"
  }
}
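
The regular expression above expects exactly one numeric query parameter, so a URL such as https://api.github.com/repos?page=3&per_page=100 would not match. Below is a hedged variation that reads the page from the parsed query string instead. It assumes absolute URLs and a runtime with the global URL class (modern browsers and Node.js). The name linkParserWithUrl is hypothetical, not part of the answer above.

```javascript
// Hypothetical variant: tolerates any number of query parameters.
const linkParserWithUrl = (linkHeader) => {
  const re = /<([^>]+)>;\s*rel="([^"]+)"/g;
  const result = {};
  let match;
  while ((match = re.exec(linkHeader)) !== null) {
    const url = match[1];
    // searchParams.get returns null when there is no "page" parameter
    const page = new URL(url).searchParams.get('page');
    result[match[2]] = { url, page };
  }
  return result;
};
```

For the header in the question, this should give next.page as '3' and last.page as '50', both as strings.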

Upvotes: 5

noelbk

Reputation: 1463

Here's a simple bash script with curl and sed to get all pages from a long query

url="https://api.github.com/repos/$GIT_USER/$GIT_REPO/issues"
while [ "$url" ]; do
      echo "$url" >&2
      curl -Ss -n "$url"
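      # Header names are lowercase over HTTP/2, so match both "Link:" and "link:".
      # [^>]* is used because sed -E has no non-greedy .*? quantifier.
      url="$(curl -Ss -I -n "$url" | sed -n -E 's/^[Ll]ink:.*<([^>]*)>; rel="next".*/\1/p')"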
done > issues.json

Upvotes: 3

pim

Reputation: 12587

I completely understand this is "technically" a JavaScript thread. But if you're like me and arrived here by googling "how to parse Link header", I thought I'd share my solution for my environment (C#).

using System;
using System.Linq;
using System.Text.RegularExpressions;

public class LinkHeader
{
    public string FirstLink { get; set; }
    public string PrevLink { get; set; }
    public string NextLink { get; set; }
    public string LastLink { get; set;}

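    public static LinkHeader FromHeader(string linkHeaderStr)
    {
        // The parameter is named linkHeaderStr so it does not clash with the local below
        LinkHeader linkHeader = null;

        if (!string.IsNullOrWhiteSpace(linkHeaderStr))
        {
            string[] linkStrings = linkHeaderStr.Split(new[] { "\"," }, StringSplitOptions.None);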

            if (linkStrings != null && linkStrings.Any())
            {
                linkHeader = new LinkHeader();

                foreach (string linkString in linkStrings)
                {
                    var relMatch = Regex.Match(linkString, "(?<=rel=\").+?(?=\")", RegexOptions.IgnoreCase);
                    var linkMatch = Regex.Match(linkString, "(?<=<).+?(?=>)", RegexOptions.IgnoreCase);

                    if (relMatch.Success && linkMatch.Success)
                    {
                        string rel = relMatch.Value.ToUpper();
                        string link = linkMatch.Value;

                        switch (rel)
                        {
                            case "FIRST":
                                linkHeader.FirstLink = link;
                                break;
                            case "PREV":
                                linkHeader.PrevLink = link;
                                break;
                            case "NEXT":
                                linkHeader.NextLink = link;
                                break;
                            case "LAST":
                                linkHeader.LastLink = link;
                                break;
                        }
                    }
                }
            }
        }

        return linkHeader;
    }
}

Testing in a console app, using GitHub's example Link header:

void Main()
{
    string link = "<https://api.github.com/user/repos?page=3&per_page=100>; rel=\"next\", <https://api.github.com/user/repos?page=50&per_page=100>; rel=\"last\"";
    LinkHeader linkHeader = LinkHeader.FromHeader(link);
}

Upvotes: 6

Anton Babenko

Reputation: 6644

If you can use Python and don't want to implement the full specification, but need something that works for the GitHub API, then here we go:

import re
header_link = '<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"'
if re.search(r'; rel="next"', header_link):
    print(re.sub(r'.*<(.*)>; rel="next".*', r'\1', header_link))

Upvotes: 3

Cosmin

Reputation: 2080

The parse-link-header npm module exists for this purpose; its source can be found on GitHub under an MIT license (free for commercial use).

Installation is as simple as:

npm install parse-link-header

Usage looks like the following:

var parse = require('parse-link-header');
var parsed = parse('<https://api.github.com/repos?page=3&per_page=100>; rel="next", <https://api.github.com/repos?page=50&per_page=100>; rel="last"')

...after which one has parsed.next, parsed.last, etc:

{ next:
   { page: '3',
     per_page: '100',
     rel: 'next',
     url: 'https://api.github.com/repos?page=3&per_page=100' },
  last:
   { page: '50',
     per_page: '100',
     rel: 'last',
     url: 'https://api.github.com/repos?page=50&per_page=100' } }

Upvotes: 28

danriti

Reputation: 907

I found this Gist: Parse Github Links header in JavaScript

I tested it against the GitHub API and it returns an object like:

var results = {
    last: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=4",
    next: "https://api.github.com/repositories/123456/issues?access_token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&state=open&since=2013-07-24T02%3A12%3A30.309Z&direction=asc&page=2"
};
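
Here is a minimal sketch of a function that builds this kind of rel-to-URL object. It is not the Gist's code, and the name parseLinks is hypothetical. It assumes GitHub's simple layout, where no commas or semicolons appear inside the URLs or the quoted parameter values.

```javascript
// Hypothetical helper: maps each rel value to its URL.
// Assumes no commas or semicolons inside URLs or quoted values.
function parseLinks(header) {
  const links = {};
  if (!header) return links;

  for (const part of header.split(',')) {
    const section = part.split(';');
    if (section.length < 2) continue; // skip malformed entries

    const urlMatch = section[0].match(/<(.*)>/);
    const relMatch = section.slice(1).join(';').match(/\brel="?([^";]+)"?/);
    if (urlMatch && relMatch) {
      links[relMatch[1].trim()] = urlMatch[1].trim();
    }
  }
  return links;
}
```

Usage would look like parseLinks(res.headers.link).next, where res is whatever response object your HTTP client provides. The value is undefined when there is no next page.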

Upvotes: 10

Atul Varma

Reputation: 313

I found wombleton/link-headers on GitHub. It appears to be made for the browser, as opposed to being an npm module, but it seems like it wouldn't be hard to modify it to work in a server-side environment. It uses pegjs to generate a real RFC 5988 parser rather than relying on string splits, so it should work well for any Link header, not just GitHub's.
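
To illustrate why plain string splits can fall short, here is a rough, hand-written sketch (not generated by pegjs and not taken from link-headers) that tolerates commas and semicolons inside <...> and "...". The function name is hypothetical. It still skips parts of the specification, such as backslash-escaped quotes and title* encoding (RFC 5988 has since been obsoleted by RFC 8288).

```javascript
// Hypothetical quote-aware parser; returns an array of link objects.
function parseLinkHeaderQuoted(header) {
  const entries = [];
  let current = '';
  let inQuotes = false;
  let inAngle = false;

  // Split on commas that sit outside <...> and "..."
  for (const ch of header) {
    if (ch === '"' && !inAngle) inQuotes = !inQuotes;
    else if (ch === '<' && !inQuotes) inAngle = true;
    else if (ch === '>' && !inQuotes) inAngle = false;

    if (ch === ',' && !inQuotes && !inAngle) {
      entries.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  if (current.trim()) entries.push(current);

  return entries.map((entry) => {
    const match = entry.match(/^\s*<([^>]*)>(.*)$/);
    if (!match) return null; // not a well-formed link value
    const link = { url: match[1].trim(), rel: [] };
    // Parameters look like  ; name=token  or  ; name="quoted value"
    const paramRe = /;\s*([^=;\s]+)\s*=\s*(?:"([^"]*)"|([^;\s]*))/g;
    let p;
    while ((p = paramRe.exec(match[2])) !== null) {
      const name = p[1].toLowerCase();
      const value = p[2] !== undefined ? p[2] : p[3];
      // rel may hold several space-separated relation types
      if (name === 'rel') link.rel = value.split(/\s+/).filter(Boolean);
      else link[name] = value;
    }
    return link;
  }).filter(Boolean);
}
```

Returning an array rather than an object keyed by rel is a design choice: one link can carry several relation types, and several links can share the same one.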

Upvotes: 7

Kevin Sawicki

Reputation: 3030

There is a PageLinks class in the GitHub Java API that shows how to parse the Link header.

Upvotes: 18
