HW-Scientist

Reputation: 434

How to fetch a large dataset in R from Google Analytics?

When I fetch a small data set (say, 2,000 observations) from Google Analytics in R using googleAnalyticsR, everything works well.

    df <- google_analytics(id=ga_id,
                           start="2017-12-01",
                           end="2017-12-31",
                           metrics="ga:users",
                           dimensions="ga:dimension1, ga:longitude, ga:latitude",
                           max=10000)  

But when I tried to fetch a larger data set of 20,000 observations, the same code failed with this error:

    Batching data into [2] calls.
    Request to profileId: ()
    Error in f(content, ...) : Invalid dimension or metric:

How can I solve this issue? Thank you.

Upvotes: 3

Views: 447

Answers (2)

MarkeD

Reputation: 2631

You need to set max to -1; it then fetches all results. You don't need to set batches, page sizes, etc.; it does that for you.

Here are some examples from the website:

    # 1000 rows only
    thousand <- google_analytics(ga_id,
                                 date_range = c("2017-01-01", "2017-03-01"),
                                 metrics = "sessions",
                                 dimensions = "date")

    # 2000 rows
    twothousand <- google_analytics(ga_id,
                                    date_range = c("2017-01-01", "2017-03-01"),
                                    metrics = "sessions",
                                    dimensions = "date",
                                    max = 2000)

    # All rows
    alldata <- google_analytics(ga_id,
                                date_range = c("2017-01-01", "2017-03-01"),
                                metrics = "sessions",
                                dimensions = "date",
                                max = -1)
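Applied to the query in the question, this might look like the sketch below. It assumes the google_analytics() interface used in the examples above, where metrics and dimensions are passed as character vectors rather than as one comma-separated string. split_fields() and fetch_all_rows() are hypothetical helper names, not part of googleAnalyticsR, and ga_id is the view ID from the question.

```r
# Hypothetical helper: turn "ga:a, ga:b" into c("ga:a", "ga:b").
split_fields <- function(x) {
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Hypothetical wrapper around googleAnalyticsR::google_analytics().
# max = -1 asks the package to fetch every row, batching as needed.
fetch_all_rows <- function(view_id, start, end, metrics, dimensions) {
  googleAnalyticsR::google_analytics(
    view_id,
    date_range = c(start, end),
    metrics    = split_fields(metrics),
    dimensions = split_fields(dimensions),
    max        = -1
  )
}

# Usage (requires prior authentication, e.g. ga_auth()):
# df <- fetch_all_rows(ga_id, "2017-12-01", "2017-12-31",
#                      metrics    = "ga:users",
#                      dimensions = "ga:dimension1, ga:longitude, ga:latitude")
```

Splitting the dimension string is only a guess at what the question's query needs; if the error persists, it is worth checking each dimension name on its own.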

Upvotes: 2

SKD

Reputation: 68

There is a provision to run your code in batches. I use the 'rga' library to download large data sets in batches, and the resulting data frame usually has all observations. Here is a slight modification of your code. Please let me know if it doesn't work.

    df <- ga$getData(id, batch = TRUE,
                     start = "2017-01-01",
                     end = "2017-12-31",
                     metrics = "ga:users",
                     dimensions = "ga:dimension1, ga:longitude,ga:latitude",
                     max = 10000)

It is from the Git version of the library. Very sorry I did not mention this earlier. I use it so much that I forgot it isn't part of the CRAN version.
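For intuition, row-based batching of this kind amounts to requesting successive pages and stacking them. The sketch below is not rga's implementation: batch_starts() and fetch_in_batches() are hypothetical names, and fetch_page stands for any function that returns one page of rows as a data frame, given a start index and a page size. It assumes total_rows is at least 1.

```r
# Start index of each page, e.g. 20000 rows in pages of 10000.
batch_starts <- function(total_rows, batch_size) {
  seq(1, total_rows, by = batch_size)
}

# Fetch every page with the supplied function and stack the results.
fetch_in_batches <- function(fetch_page, total_rows, batch_size) {
  pages <- lapply(batch_starts(total_rows, batch_size), function(start) {
    fetch_page(start, batch_size)
  })
  do.call(rbind, pages)
}

# Stand-in for a real API call: returns rows start..(start + size - 1).
fake_page <- function(start, size) {
  data.frame(row = seq(start, length.out = size))
}

df <- fetch_in_batches(fake_page, total_rows = 20000, batch_size = 10000)
nrow(df)
```

With 20,000 rows and a page size of 10,000, this makes two calls, which matches the "Batching data into [2] calls" message in the question.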

Upvotes: 1
