Sarah

Reputation: 27

ggplot2 - Barchart or Histogram in R - plotting more than one variable

Sorry, I'm quite new to R. I have been trying to do this by myself but have been struggling.

I'm trying to do some sort of barplot or histogram of the tag 'Amateur' over the years 2007 to 2013 to show how it's changed over time.

The data set was downloaded from https://sexualitics.github.io/; I'm specifically looking at xhamster.csv.

Here is some of the initial preprocessing of the data:

    library(dplyr)     # for %>% and the filter helpers used further down
    library(lubridate) # for year() and ymd()

    head(xhamster) # Need to change upload_date into a date column, then add a new column containing the year
    xhamster$upload_date<-as.Date(xhamster$upload_date,format="%d/%m/%Y")
    xhamster$Year<-year(ymd(xhamster$upload_date)) # Adds a new column containing just the year
    xhamster$Year<-as.integer(xhamster$Year) # Changes the new Year variable into an integer
    head(xhamster) # Check the changes were made correctly

The filter for the years:

    Yr2007<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2007")))
    Yr2008<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2008")))
    Yr2009<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2009")))
    Yr2010<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2010")))
    Yr2011<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2011")))
    Yr2012<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2012")))
    Yr2013<-xhamster%>%
      filter_at(vars(Year),any_vars(.%in%c("2013")))

For example, I want to create a plot for the tag 'Amateur' in the data. Here is the code I have written so far:

    Amateur<-grep("Amateur",xhamster$channels)
    Amateur_2007<-grep("Amateur", Yr2007$channels)
    Amateur_2008<-grep("Amateur", Yr2008$channels)
    Amateur_2009<-grep("Amateur", Yr2009$channels)
    Amateur_2010<-grep("Amateur", Yr2010$channels)
    Amateur_2011<-grep("Amateur", Yr2011$channels)
    Amateur_2012<-grep("Amateur", Yr2012$channels)
    Amateur_2013<-grep("Amateur", Yr2013$channels)

    Amateur_2007 <- length(Amateur_2007)
    Amateur_2008 <- length(Amateur_2008)
    Amateur_2009 <- length(Amateur_2009)
    Amateur_2010 <- length(Amateur_2010)
    Amateur_2011 <- length(Amateur_2011)
    Amateur_2012 <- length(Amateur_2012)
    Amateur_2013 <- length(Amateur_2013)

Plot:

    Amateur <- cbind(Amateur_2007, Amateur_2008, Amateur_2009,Amateur_2010, Amateur_2011, Amateur_2012, Amateur_2013)
    barplot((Amateur),beside=TRUE,col = c("red","orange"),ylim=c(0,90000))
    title(main="Usage of 'Amateur' as a tag from 2007 to 2013")
    title(xlab="Amateur")
    title(ylab="Frequency")

[Figure: plot showing the 'Amateur' tag over the years]

However, this isn't exactly a great plot. Ideally I'm looking for a way to plot this using ggplot, and to have each bar labelled with the year rather than 'Amateur_2010' etc. How do I do this?

An even better bonus would be if I could add 'nb_views' for each year with this tag usage, or something like that.

Upvotes: 1

Views: 2023

Answers (2)

As Jared said, there are lots of ways, but I want to solve it your way, so that you can internalize the solution better.

I just changed your cbind in the plot:

    Amateur <- cbind("2007" = Amateur_2007, "2008" = Amateur_2008, "2009" = Amateur_2009,
                     "2010" = Amateur_2010, "2011" = Amateur_2011, "2012" = Amateur_2012,
                     "2013" = Amateur_2013)

As you can see, you can name your columns directly inside the cbind function like that :)
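To see the effect in isolation, here is a minimal sketch with made-up counts (the three numbers are placeholders, not values from the real data). It only shows that named arguments to `cbind()` become the column names, which `barplot()` then uses as bar labels:

```r
# Made-up counts for illustration only; not taken from the real data set
Amateur_2007 <- 120
Amateur_2008 <- 340
Amateur_2009 <- 560

# Naming the arguments to cbind() sets the column names of the resulting matrix
Amateur <- cbind("2007" = Amateur_2007,
                 "2008" = Amateur_2008,
                 "2009" = Amateur_2009)
print(colnames(Amateur))

# For a matrix, barplot() labels the bars with the column names by default
barplot(Amateur, beside = TRUE, col = "steelblue",
        main = "Usage of 'Amateur' as a tag",
        xlab = "Year", ylab = "Frequency")
```

With the real counts from the question, the same naming should give bars labelled 2007 to 2013.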

Upvotes: 1

jared_mamrot

Reputation: 26695

There are lots of ways to approach this; here is how I would tackle it:

    library(tidyverse)
    library(lubridate)
    library(vroom)

    # Read the data and derive a Year column from upload_date
    xhamster <- vroom("xhamster.csv")
    xhamster$upload_date <- as.Date(xhamster$upload_date, format = "%d/%m/%Y")
    xhamster$Year <- year(ymd(xhamster$upload_date))

    # Keep 2007-2013 rows tagged 'Amateur', then count rows per year
    xhamster %>%
      filter(Year %in% 2007:2013) %>%
      filter(grepl("Amateur", channels)) %>%
      ggplot(aes(x = Year, y = after_stat(count))) + # after_stat() replaces the deprecated ..count..
      geom_bar() +
      scale_x_continuous(breaks = 2007:2013,
                         labels = 2007:2013) +
      ylab(label = "Count") +
      xlab(label = "Amateur") +
      labs(title = "Usage of 'Amateur' as a tag from 2007 to 2013",
           caption = "Data obtained from https://sexualitics.github.io/ under a CC BY-NC-SA 3.0 license") +
      theme_minimal(base_size = 14)

[Figure: example_1.png]
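For the bonus part of the question (adding 'nb_views' per year), one possible approach is to summarise before plotting. The sketch below is only an illustration: `toy` is a made-up stand-in for `xhamster`, and it assumes `nb_views` is a numeric column and that `channels` is a text column containing the tag names. The names `toy`, `views_by_year`, `videos` and `total_views` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Made-up stand-in for xhamster, for illustration only
toy <- data.frame(
  Year     = c(2007, 2007, 2008, 2008, 2009),
  channels = c("['Amateur']", "['Other']", "['Amateur', 'Other']",
               "['Amateur']", "['Other']"),
  nb_views = c(100, 50, 200, 300, 400)
)

# One row per year: number of tagged videos and their summed views
views_by_year <- toy %>%
  filter(Year %in% 2007:2013, grepl("Amateur", channels)) %>%
  group_by(Year) %>%
  summarise(videos = n(),
            total_views = sum(nb_views, na.rm = TRUE))

print(views_by_year)

# geom_col() plots the precomputed totals; factor(Year) gives one label per bar
p <- ggplot(views_by_year, aes(x = factor(Year), y = total_views)) +
  geom_col() +
  labs(x = "Year", y = "Total views",
       title = "Views of videos tagged 'Amateur' by year") +
  theme_minimal(base_size = 14)
print(p)
```

Swapping `toy` for the real `xhamster` data frame should work the same way, provided the column assumptions above hold. Using `y = videos` instead of `y = total_views` gives the tag count per year.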

Upvotes: 1
