Reputation: 7948
I want to generate a plot of interest over time using GTrendsR and ggplot2.
The plot I want (generated with Google Trends) is this:
Any help will be much appreciated.
Thanks!
This is the best I was able to get:
library(ggplot2)
library(devtools)
library(GTrendsR)
usr = "my.email"
psw = "my.password"
ch = gConnect(usr, psw)
location = "all"
query = "MOOCs"
MOOCs_trends = gTrends(ch, geo = location, query = query)
MOOCs<-MOOCs_trends[[1]]
MOOCs$moocs<-as.numeric(as.character(MOOCs$moocs))
MOOCs$Week <- as.character(MOOCs$Week)
MOOCs$start <- as.Date(MOOCs$Week)
ggplot(MOOCs[MOOCs$moocs != 0, ], aes(start, moocs)) +
  geom_line(colour = "blue") +
  ylab("Trends") + xlab("") + theme_bw()
I think that to match the graph generated by Google I would need to aggregate the data by month instead of by week, but I am not sure how to do that yet.
Upvotes: 2
Views: 542
Reputation: 3595
The object returned by gtrendsR is a list, of which the trend element is a data.frame that you would want to plot.
library(gtrendsR)
library(ggplot2)
usr = "my.email"
psw = "my.password"
gconnect(usr, psw)
MOOCs_trends = gtrends('MOOCs')
# the trend element holds the weekly series as a data.frame
MOOCsDF <- MOOCs_trends$trend
ggplot(data = MOOCsDF) + geom_line(aes(x = start, y = moocs))
This gives:
Now if you want to aggregate by month, I would suggest using the floor_date function from the lubridate package, in combination with dplyr (note that I am using the chain operator %>%, which dplyr re-exports from the magrittr package).
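library(gtrendsR)
library(ggplot2)
library(dplyr)
library(lubridate)
usr = "my.email"
psw = "my.password"
gconnect(usr, psw)
MOOCs_trends = gtrends('MOOCs')
# take the trend data.frame, not the whole list
MOOCsDF <- MOOCs_trends$trend
# round every week's start date down to the first day of its month
MOOCsDF$start <- floor_date(MOOCsDF$start, unit = 'month')
MOOCsDF %>%
  group_by(start) %>%
  summarise(moocs = sum(moocs)) %>%
  ggplot() + geom_line(aes(x = start, y = moocs))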
This gives:
Note 1: The query MOOCs was changed to moocs by gtrendsR; this is reflected in the y variable that you're plotting.
Note 2: The capitalisation of some names has changed (e.g. gtrendsR, not GTrendsR); I am using the current versions.
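The monthly aggregation can also be tried without a Google login. The following is only a sketch under stated assumptions: the weekly data frame is synthetic (the values are made up and merely stand in for MOOCs_trends$trend), and it uses base R's format() and aggregate() in place of floor_date and summarise, so it is not the code from this answer.

```r
# Synthetic weekly series standing in for MOOCs_trends$trend (values are made up).
weekly <- data.frame(
  start = seq(as.Date("2013-01-06"), by = "week", length.out = 12),
  moocs = c(10, 12, 11, 15, 18, 20, 19, 22, 25, 24, 28, 30)
)

# Base-R counterpart of floor_date(start, unit = "month"):
# replace the day with 01 so every week maps to the first day of its month.
weekly$month <- as.Date(format(weekly$start, "%Y-%m-01"))

# Sum the weekly values within each month, mirroring summarise(moocs = sum(moocs)).
monthly <- aggregate(moocs ~ month, data = weekly, FUN = sum)
print(monthly)
```

The result can be plotted with ggplot(monthly, aes(month, moocs)) + geom_line(). Passing FUN = mean instead of sum keeps the monthly values on the same scale as the weekly ones, which may be closer to what the Google chart shows.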
Upvotes: 2
Reputation: 804
This will get you most of the way there. The plot doesn't look quite right, but that is mostly because the data are a bit different. Here are the necessary conversions to numeric and to dates.
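library(ggplot2)
library(plyr)
MOOCs <- MOOCs_trends[[1]]
## Convert to string, then to numeric (going through character avoids
## getting factor level codes if the column is a factor)
MOOCs$Week <- as.character(MOOCs$Week)
MOOCs$moocs <- as.numeric(as.character(MOOCs$moocs))
# Split each "start - end" string and keep the second piece (the end date)
MOOCs$start <- unlist(llply(strsplit(MOOCs$Week, " - "), function(x) return(x[2])))
# Store as Date; POSIXlt is a list-based class that is awkward as a data frame column
MOOCs$start <- as.Date(MOOCs$start)
ggplot(MOOCs, aes(x = start, y = moocs)) + geom_point() + geom_path()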
Google might do some smoothing, but this will plot the data you have.
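To see what the string split does on its own, here is a small base-R sketch that needs neither plyr nor a live query. The two Week labels are hypothetical examples in the "start - end" form that the split above assumes; the real column may differ between package versions.

```r
# Hypothetical Week labels in the "start - end" form assumed by the split above.
week <- c("2013-01-06 - 2013-01-12", "2013-01-13 - 2013-01-19")

# strsplit() returns one character vector per label: c(start, end).
parts <- strsplit(week, " - ", fixed = TRUE)

# Pick the first or second piece of each label and convert it to Date.
week_start <- as.Date(vapply(parts, `[`, character(1), 1))
week_end   <- as.Date(vapply(parts, `[`, character(1), 2))
print(data.frame(week_start, week_end))
```

Either column can serve as the x variable; the code above uses the end of each week.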
Upvotes: 0