Grant Watters

Reputation: 670

Drive API files.list returning nextPageToken with empty item results

In the last week or so we got a report of a user missing files in the file list in our app. We were a bit confused at first because they said they only had a couple of files that matched our query string, but with a bit of work we were able to reproduce their issue by adding a large number of files to our Google Drive. Previously we had been assuming people would have fewer than 100 files and hadn't been doing paging, to avoid multiple files.list requests.

After switching to paging, we noticed that one of our test accounts was sending hundreds and hundreds of files.list requests, and most of the responses did not contain any files but did contain a nextPageToken. I'll update as soon as I can get a screenshot, but the client was sending enough requests to heat the computer up and drain the battery fairly quickly.

We also found that the form of the query, even when it matches the same files, can have a drastic effect on the number of requests needed to retrieve our full file list. For example, switching '=' to 'contains' in the query param significantly reduces the number of requests made, but we don't see any guarantee that this is a reasonable and generalizable solution.

Is this the intended behavior? Is there anything we can do to reduce the number of requests that we are sending?

We're using the following code, which is causing the issue, to retrieve files created by our app.

runLoad: function(pageToken)
{
    // Request one page of files matching our app's MIME type.
    gapi.client.drive.files.list(
    {
        'maxResults': 999,
        'pageToken': pageToken,
        'q': "trashed=false and mimeType='" + mime + "'"
    }).execute(function (results)
    {
        this.filePageRequests++;

        // Stop on an error, on the last page, or once the request cap is reached.
        if (results.error || !results.nextPageToken || this.filePageRequests >= MAX_FILE_PAGE_REQUESTS)
        {
            this.isLoading(false);
        }
        else
        {
            // Otherwise follow the token to the next page.
            this.runLoad(results.nextPageToken);
        }
    }.bind(this));
}

Upvotes: 3

Views: 1428

Answers (1)

pinoyyid

Reputation: 22316

It is, but probably shouldn't be, the correct behaviour.

It generally occurs when using the drive.file scope. What (I think) is happening is that the API layer is fetching all files, and then removing those that are outside of the current scope/query, and returning the remainder to your client app. In theory, a particular page of files could have no files in-scope, and so the returned array is empty.

As you've seen, it's a horribly inefficient way of doing it, but that seems to be the way it is. You simply have to keep following the next page link until it's null.
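A minimal sketch of that loop, assuming a callback-style wrapper around files.list. The names listAllFiles and listPage are hypothetical, not part of the Drive client. The point it illustrates is that an empty page is not the end of the list: only a missing nextPageToken is.

```javascript
// Hypothetical helper: collects the items from every page.
// `listPage(pageToken, callback)` is an assumed wrapper around
// gapi.client.drive.files.list({...}).execute(callback).
function listAllFiles(listPage, maxRequests, done) {
    var files = [];
    var requests = 0;

    function next(pageToken) {
        listPage(pageToken, function (results) {
            requests++;

            if (results.error) {
                return done(results.error, files, requests);
            }

            // A page may be empty and still carry a token; keep whatever it has.
            files = files.concat(results.items || []);

            // Stop only when the token is gone, or when the safety cap is hit
            // (in which case the list may be incomplete).
            if (!results.nextPageToken || requests >= maxRequests) {
                return done(null, files, requests);
            }

            next(results.nextPageToken);
        });
    }

    next(undefined);
}
```

With the code from the question, listPage would be a small function that calls gapi.client.drive.files.list with the given pageToken and passes the callback to execute.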

As to "Is there anything we can do to reduce the number of requests that we are sending?"

You're already setting max results to 999 which is the obvious step. Just be aware that I have seen this value trigger internal errors (timeouts?) which manifest themselves as 500 errors. You might want to sacrifice efficiency for reliability and stick to the default of 100 which seems to be better tested.

I don't know if the code you posted is your actual code, or just a simplified illustration, but you need to make sure you are dealing with 401 errors (auth expiry) and 500 errors (sometimes recoverable with a retry).
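One way that error handling could look, as a rough sketch only. listPageWithRetry, listPage, refreshAuth, wait, maxRetries and baseDelayMs are all hypothetical names, and it assumes the error object passed to the callback carries a numeric code field. It refreshes auth once on a 401 and retries 5xx responses with exponential backoff.

```javascript
// Hypothetical wrapper around a single page request.
// `listPage(pageToken, callback)` and `options.refreshAuth(callback)` are
// assumed helpers supplied by the caller; `results.error.code` is assumed
// to hold the HTTP status of a failed request.
function listPageWithRetry(listPage, pageToken, options, callback) {
    var attempt = 0;
    var refreshed = false;
    var wait = options.wait || function (fn, ms) { setTimeout(fn, ms); };

    function tryOnce() {
        listPage(pageToken, function (results) {
            var code = results.error && results.error.code;

            if (code === 401 && !refreshed) {
                // Auth expired: refresh once, then repeat the same page.
                refreshed = true;
                return options.refreshAuth(tryOnce);
            }

            if (code >= 500 && attempt < options.maxRetries) {
                // Server error: back off exponentially, then retry.
                attempt++;
                return wait(tryOnce, options.baseDelayMs * Math.pow(2, attempt - 1));
            }

            // Success, or an error we are not going to retry.
            callback(results);
        });
    }

    tryOnce();
}
```

Any error that survives the retries is passed to the callback unchanged, so the caller still has to decide what to show the user.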

Upvotes: 4
