E. Faslo

Reputation: 375

Use of pageToken with Google Analytics Reporting API v4 and Python

I followed a tutorial on downloading data from Google Analytics with Python using the GA Reporting API. I was able to query the data I wanted, but I am hitting the row limit. I saw in the documentation that there is a pageToken field to get around this. I have added this field to my request (as described in the documentation), but I cannot get it to work.

sample_request = {
      'viewId': '12345678',
      'dateRanges': {
          'startDate': datetime.strftime(datetime.now() - timedelta(days = 30),'%Y-%m-%d'),
          'endDate': datetime.strftime(datetime.now(),'%Y-%m-%d')
      },
      'dimensions': [
          {'name': 'ga:date'},
          {'name': 'ga:dimension7'},
          {'name': 'ga:dimension6'},
          {'name': 'ga:dimension9'}
      ],
      'metrics': [
          {'expression': 'ga:users'},
          {'expression': 'ga:totalevents'}
      ],
      "pageSize": 100000,
      'pageToken': 'abc'
    }

response = api_client.reports().batchGet(
      body={
        'reportRequests': sample_request
      }).execute()

Upvotes: 2

Views: 6287

Answers (2)

Tobi

Reputation: 1902

I solved it like this, by calling the report recursively until no nextPageToken comes back:

import pandas as pd

def handle_report(analytics, pagetoken, rows):
    # Fetch one page; get_report (not shown) must send pagetoken as 'pageToken'
    response = get_report(analytics, pagetoken)
    report = response.get("reports")[0]

    # Header info, kept in case the column names are needed later
    columnHeader = report.get('columnHeader', {})
    dimensionHeaders = columnHeader.get('dimensions', [])
    metricHeaders = columnHeader.get('metricHeader', {}).get('metricHeaderEntries', [])

    # nextPageToken is absent on the last page
    pagetoken = report.get('nextPageToken', None)
    rowsNew = report.get('data', {}).get('rows', [])
    rows = rows + rowsNew
    print("len(rows): " + str(len(rows)))

    if pagetoken is not None:
        return handle_report(analytics, pagetoken, rows)
    else:
        return rows

def main():
    analytics = initialize_analyticsreporting()

    global dfanalytics
    dfanalytics = []

    # Start from the first page and accumulate all rows
    rows = []
    rows = handle_report(analytics, '0', rows)

    dfanalytics = pd.DataFrame(list(rows))
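
The get_report helper called above is not shown in this answer. A minimal sketch of what it might look like, assuming a request similar to the one in the question and a hypothetical VIEW_ID constant. The dimensions and metrics are placeholders; the part that matters for paging is that the token is sent back as pageToken:

```python
from datetime import datetime, timedelta

VIEW_ID = '12345678'  # hypothetical view ID, copied from the question


def get_report(analytics, pagetoken=None):
    """Request one page of results (hypothetical helper, not from the answer)."""
    today = datetime.now()
    request = {
        'viewId': VIEW_ID,
        'dateRanges': [{
            'startDate': (today - timedelta(days=30)).strftime('%Y-%m-%d'),
            'endDate': today.strftime('%Y-%m-%d'),
        }],
        'dimensions': [{'name': 'ga:date'}],      # placeholder dimension
        'metrics': [{'expression': 'ga:users'}],  # placeholder metric
        'pageSize': 100000,
    }
    # Only send pageToken when there is one to send
    if pagetoken is not None:
        request['pageToken'] = pagetoken

    # reportRequests is a list, even for a single request
    return analytics.reports().batchGet(
        body={'reportRequests': [request]}
    ).execute()
```

Here analytics is assumed to be the service object returned by initialize_analyticsreporting().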

Upvotes: 0

CharlieH

Reputation: 1542

You will hit the limit on any single request, but the nextPageToken returned in each response lets you page through the remaining rows: send it back as pageToken in the next request. For example:

def processReport(self, aDimensions):
    """Get a full report, returning the rows"""

    # Get the first set
    oReport   = self.getReport(aDimensions)
    oResponse = self.getResponse(oReport, True)
    aRows     = oResponse.get('rows') or []

    # Add any additional sets, passing each nextPageToken back in
    while oResponse.get('nextPageToken') is not None:
        oReport   = self.getReport(aDimensions, oResponse.get('nextPageToken'))
        oResponse = self.getResponse(oReport, False)
        aRows.extend(oResponse.get('rows') or [])

    return aRows

You can see the complete program here: https://github.com/aiqui/ga-download
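
The same loop, written against the request dict from the question, might look like the sketch below. It assumes api_client is the service object the question already builds; fetch_all_rows is a made-up name, not part of the library. The first request carries no pageToken at all, and each later one reuses the nextPageToken from the previous response:

```python
def fetch_all_rows(api_client, report_request):
    """Collect the rows of every page for a single report request."""
    request = dict(report_request)   # copy, so the caller's dict is untouched
    request.pop('pageToken', None)   # the first page is requested without a token
    rows = []

    while True:
        response = api_client.reports().batchGet(
            body={'reportRequests': [request]}
        ).execute()
        report = response['reports'][0]
        rows.extend(report.get('data', {}).get('rows', []))

        token = report.get('nextPageToken')
        if token is None:            # no token means this was the last page
            return rows
        request['pageToken'] = token
```

With the dict from the question this would be called as rows = fetch_all_rows(api_client, sample_request); the placeholder 'abc' token is dropped before the first call.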

Upvotes: 5
