Reputation: 27
So sorry I'm quite new to R and have been trying to do this by myself but have been struggling.
I'm trying to do some sort of barplot or histogram of the tag 'Amateur' over the years 2007 to 2013 to show how it's changed over time.
The data set was downloaded from: https://sexualitics.github.io/ specifically looking at the hamster.csv
Here is some of the initial preprocessing of the data below.
head(xhamster) # Need to change upload_date into a date column, then add new column containing year
xhamster$upload_date<-as.Date(xhamster$upload_date,format="%d/%m/%Y")
xhamster$Year<-year(ymd(xhamster$upload_date)) #Adds new column containing just the year
xhamster$Year<-as.integer(xhamster$Year) # Changing new Year variable into an interger
head(xhamster) # Check changes made correctly
The filter for the years:
Yr2007<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2007")))
Yr2008<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2008")))
Yr2009<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2009")))
Yr2010<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2010")))
Yr2011<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2011")))
Yr2012<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2012")))
Yr2013<-xhamster%>%
filter_at(vars(Year),any_vars(.%in%c("2013")))
For example, I want to create a plot for the tag 'Amateur' in the data. Here is some of the code I have already done:
Amateur<-grep("Amateur",xhamster$channels)
Amateur_2007<-grep("Amateur", Yr2007$channels)
Amateur_2008<-grep("Amateur", Yr2008$channels)
Amateur_2009<-grep("Amateur", Yr2009$channels)
Amateur_2010<-grep("Amateur", Yr2010$channels)
Amateur_2011<-grep("Amateur", Yr2011$channels)
Amateur_2012<-grep("Amateur", Yr2012$channels)
Amateur_2013<-grep("Amateur", Yr2013$channels)
Amateur_2007 <- length(Amateur_2007)
Amateur_2008 <- length(Amateur_2008)
Amateur_2009 <- length(Amateur_2009)
Amateur_2010 <- length(Amateur_2010)
Amateur_2011 <- length(Amateur_2011)
Amateur_2012 <- length(Amateur_2012)
Amateur_2013 <- length(Amateur_2013)
Plot:
Amateur <- cbind(Amateur_2007, Amateur_2008, Amateur_2009,Amateur_2010, Amateur_2011, Amateur_2012, Amateur_2013)
barplot((Amateur),beside=TRUE,col = c("red","orange"),ylim=c(0,90000))
title(main="Usage of 'Amateur' as a tag from 2007 to 2013")
title(xlab="Amateur")
title(ylab="Frequency")
Plot showing amateur tag over the years
However this isn't exactly a great plot. I'm looking for a way to plot using ggplot ideally and to have the names of each bar to be the year rather than 'Amateur_2010' etc. How do I do this?
An even better bonus if I can add 'nb_views' for each year with this tag usage or something like that.
Upvotes: 1
Views: 2023
Reputation: 11
As Jared said, there are lots of ways, but I want to solve it with your way, so that you can internalize the solution better.
I just changed your cbind in the plot:
Amateur <- cbind("2007" = Amateur_2007,"2008" = Amateur_2008,"2009" = Amateur_2009, "2010" =Amateur_2010, "2011" = Amateur_2011, "2012" = Amateur_2012, "2013" = Amateur_2013)
As you can see, you can give names to your columns into cbind function like that :)
Upvotes: 1
Reputation: 26695
There are lots of ways to approach this, here is how I would tackle it:
library(tidyverse)
library(lubridate)
library(vroom)
xhamster <- vroom("xhamster.csv")
xhamster$upload_date<-as.Date(xhamster$upload_date,format="%d/%m/%Y")
xhamster$Year <- year(ymd(xhamster$upload_date))
xhamster %>%
filter(Year %in% 2007:2013) %>%
filter(grepl("Amateur", channels)) %>%
ggplot(aes(x = Year, y = ..count..)) +
geom_bar() +
scale_x_continuous(breaks = c(2007:2013),
labels = c(2007:2013)) +
ylab(label = "Count") +
xlab(label = "Amateur") +
labs(title = "Usage of 'Amateur' as a tag from 2007 to 2013",
caption = "Data obtained from https://sexualitics.github.io/ under a CC BY-NC-SA 3.0 license") +
theme_minimal(base_size = 14)
Upvotes: 1