Reputation: 5525
I have a custom report in the Google Analytics dashboard. I also fetch the same data through the Python googleapiclient, but the two don't match: the data obtained through Python consistently has about 10% fewer data points than the dashboard.
Here's the format of the report request.
    def get_report(analytics, token):
        return analytics.reports().batchGet(
            body={
                'reportRequests': [
                    {
                        'viewId': VIEW_ID,
                        'dateRanges': [{'startDate': '1daysAgo', 'endDate': '1daysAgo'}],
                        'metrics': [
                            {'expression': 'ga:users'},
                            ........
                        ],
                        'dimensions': [
                            {'name': 'ga:date'},
                            {'name': 'ga:hour'},
                            ....
                        ],
                        'pageSize': 100000,
                        'pageToken': token,
                        'samplingLevel': 'HIGH',
                    }]
            }
        ).execute()
I believe sampling is not the problem since report.get('samplesReadCounts') returns None.
What could be the problem? I also checked in the Query Explorer, and the numbers don't match there either.
Upvotes: 1
Views: 139
Reputation: 116918
This is probably due to latency. You should not be requesting yesterday's data from Google Analytics: most of the time the data takes 24 to 48 hours to finish processing.
You can check this by checking the isDataGolden field in the response.
Indicates if response to this request is golden or not. Data is golden when the exact same request will not produce any new results if asked at a later point in time.
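As a rough sketch of that check, the helper below walks a `batchGet` response and lists the reports whose data is not yet golden. In the Reporting API v4 the `isDataGolden` flag sits under each report's `data` object. The function name `non_golden_reports` is made up for illustration, and treating a missing flag as "not golden" is an assumption on my part.

```python
def non_golden_reports(response):
    """Return the indexes of reports whose data is not yet golden."""
    pending = []
    for index, report in enumerate(response.get('reports', [])):
        data = report.get('data', {})
        # Assumption: a missing isDataGolden field means "not golden yet".
        if not data.get('isDataGolden', False):
            pending.append(index)
    return pending
```

If `non_golden_reports(get_report(analytics, token))` is non-empty, the numbers may still change, so comparing them against the dashboard is premature.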
That being said, it is very hard to get the reports on the website to match exactly with the data returned by the API. You need to request exactly the same dates, dimensions, and metrics that the report was built on, and for some of the reports on the website it can be very hard to know which dimensions and metrics were used.
Even setting samplingLevel to HIGH does not prevent sampling.
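To see whether a given response was actually sampled, one option is to look at `samplesReadCounts` and `samplingSpaceSizes`. In the v4 response these live under each report's `data` object, so calling `.get('samplesReadCounts')` on the report itself returns None whether or not sampling happened. The sketch below uses a made-up helper name, `sampling_ratios`, and assumes the counts arrive as numeric strings, one per date range.

```python
def sampling_ratios(report):
    """Return the sampled fraction per date range, or None if unsampled."""
    data = report.get('data', {})
    read_counts = data.get('samplesReadCounts')
    space_sizes = data.get('samplingSpaceSizes')
    # Both fields are absent when the report was not sampled.
    if not read_counts or not space_sizes:
        return None
    # Assumption: counts are numeric strings; int() also accepts plain ints.
    return [int(read) / int(size)
            for read, size in zip(read_counts, space_sizes)]
```

For example, `sampling_ratios(response['reports'][0])` returning a list means that report was built from a sample of the sessions; None means the sampling fields were not present.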
Upvotes: 1