Reputation: 2908
I noticed something I don't understand while trying to get the number of open issues per repository for a user.
When I use the following query, I am asked to paginate (as expected):
query {
  user(login: "armsp") {
    repositories {
      nodes {
        name
        issues(states: OPEN) {
          totalCount
        }
      }
    }
  }
}
The error message after running the above:
{
  "data": {
    "user": null
  },
  "errors": [
    {
      "type": "MISSING_PAGINATION_BOUNDARIES",
      "path": [
        "user",
        "repositories"
      ],
      "locations": [
        {
          "line": 54,
          "column": 5
        }
      ],
      "message": "You must provide a `first` or `last` value to properly paginate the `repositories` connection."
    }
  ]
}
However, when I do the following, I actually get all the results, which doesn't make any sense to me:
query {
  user(login: "armsp") {
    repositories {
      totalCount
    }
    repositories {
      nodes {
        name
        issues(states: OPEN) {
          totalCount
        }
      }
    }
  }
}
Shouldn't I be asked for pagination in the second query too?
Upvotes: 6
Views: 1983
Reputation: 84837
TL;DR: This appears to be a bug. There's no way to bypass the limit applied when fetching a list of resources.
Limiting responses like this is a common feature of public APIs: if the response could include thousands or millions of results, fulfilling it all at once would tie up a lot of server resources. Allowing users to make that sort of query is both costly and a potential security risk.
GitHub's intent appears to be to always limit the number of results when fetching a list of resources. This isn't well documented on the GraphQL side, but it matches the behavior of their REST API:
Requests that return multiple items will be paginated to 30 items by default. You can specify further pages with the `?page` parameter. For some resources, you can also set a custom page size up to 100 with the `?per_page` parameter.
For connections, it looks like the check for the `first` or `last` parameter is only run when the `nodes` field is present in the selection set. This makes sense, since `nodes` is ultimately the field we want to limit. Requesting other fields like `totalCount` or `totalDiskUsage`, even without a limit argument, is harmless with regard to the above concerns.
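For comparison, here is a minimal sketch of the original query with a pagination boundary supplied, which is what the error message asks for. The page size of 100 is an assumption based on the maximum mentioned above; any smaller value should work as well.

```graphql
query {
  user(login: "armsp") {
    # `first` satisfies the pagination check on the connection
    repositories(first: 100) {
      totalCount
      nodes {
        name
        # only `totalCount` is selected here, so no `nodes` are fetched for issues
        issues(states: OPEN) {
          totalCount
        }
      }
    }
  }
}
```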
Things get funky when you consider how GraphQL handles selection sets that contain several selections with the same name. Without getting into the nitty-gritty details, GraphQL will let you request the same field multiple times. If the field in question has a selection set, it will effectively merge the selection sets into a single one. So
query {
  user(login: "armsp") {
    repositories {
      totalCount
    }
    repositories {
      totalDiskUsage
    }
  }
}
becomes and is equivalent to
query {
  user(login: "armsp") {
    repositories {
      totalCount
      totalDiskUsage
    }
  }
}
Side note: The above does not hold true if you explicitly give one of the fields an alias since then the two fields have different response names.
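To illustrate the side note, here is a sketch of the same query with aliases. The alias names `counts` and `usage` are made up for this example. Because the two selections now have different response names, they should not be merged, and the response would contain a separate object under each alias.

```graphql
query {
  user(login: "armsp") {
    # hypothetical alias: appears as "counts" in the response
    counts: repositories {
      totalCount
    }
    # hypothetical alias: appears as "usage" in the response
    usage: repositories {
      totalDiskUsage
    }
  }
}
```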
All that to say, technically this query:
query {
  user(login: "armsp") {
    repositories {
      totalCount
    }
    repositories {
      nodes {
        name
        issues(states: OPEN) {
          totalCount
        }
      }
    }
  }
}
should also blow up with the same `MISSING_PAGINATION_BOUNDARIES` error. The fact that it doesn't means the selection set merging is somehow borking the check that's in place. This is clearly a bug. However, even though this appears to "work", it still doesn't get around whatever limits GitHub applies at the storage layer: you will always get at most 100 results, even when exploiting the above bug.
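If a user has more than 100 repositories, the usual way to get them all is cursor-based pagination rather than relying on the bug. A minimal sketch, assuming a query variable named `$cursor` (the name is arbitrary) that is left null for the first request and set to the previous page's `endCursor` for each following one:

```graphql
query ($cursor: String) {
  user(login: "armsp") {
    repositories(first: 100, after: $cursor) {
      totalCount
      pageInfo {
        hasNextPage   # false once the last page has been reached
        endCursor     # pass this as $cursor in the next request
      }
      nodes {
        name
        issues(states: OPEN) {
          totalCount
        }
      }
    }
  }
}
```

The client repeats the request until `hasNextPage` is false, collecting `nodes` from each page.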
Upvotes: 2