CMilby

Reputation: 644

Plot Number of Categories Per Year

I have some data that I need to graph in R. There are two columns of data. The first is a series of years ranging from 2001 to 2011. The second is a string column whose values can be anything. I need to make a multi-line graph (I was trying to use ggplot) where the number of occurrences of each string is on the y-axis and the year is on the x-axis.

I don't really have much of an idea where to start. This is what I had, but I'm not sure if it is correct.

year <- data$year
# I don't know how to get occurrences per year
# year_2001 <- data$string[data$year == 2001]
# would this work?

# ggplot + geom_line()

I know most of that is commented out but that's because I'm new to R. Any help or guidance is greatly appreciated. Thanks!

Upvotes: 0

Views: 779

Answers (1)

vpipkt

Reputation: 1717

Here is one way to get it done.

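library(ggplot2)

# simulated example data: 100 rows of random years and strings
set.seed(272727)
data <- data.frame(year = sample(2001:2011, 100, replace = TRUE),
                   string = sample(letters[1:5], 100, replace = TRUE))
# this is what will be plotted: the count of each string per year
table(data$string, data$year)
# xtabs keeps year-string combinations with zero counts; Freq holds the count
dataSummary <- as.data.frame(xtabs(~ year + string, data))
# group = string draws one line per string (year is a factor here)
ggplot(dataSummary, aes(x = year, y = Freq, group = string, colour = string)) +
  geom_line()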

Resulting ggplot
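
One thing to be aware of: as.data.frame() on a table returns the classifying variables as factors, so year is treated as a discrete axis in the plot above. If a numeric axis is preferred, a minimal sketch (reusing the simulated data from the answer; the scale_x_continuous() breaks are my own assumption about how the axis should look) could be:

```r
library(ggplot2)

# same simulated data as in the answer
set.seed(272727)
data <- data.frame(year = sample(2001:2011, 100, replace = TRUE),
                   string = sample(letters[1:5], 100, replace = TRUE))

dataSummary <- as.data.frame(xtabs(~ year + string, data))
# year is a factor here; go through as.character() to recover the
# actual year values rather than the underlying factor codes
dataSummary$year <- as.integer(as.character(dataSummary$year))

# with a numeric x, colour = string is enough to get one line per string
p <- ggplot(dataSummary, aes(x = year, y = Freq, colour = string)) +
  geom_line() +
  scale_x_continuous(breaks = 2001:2011)
# print(p) draws the plot
```

Converting with as.integer() directly on the factor would return the level codes (1, 2, 3, ...) instead of the years, which is why the as.character() step is there.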

Note: my previous answer used dplyr, but it had an issue with year-string combinations of zero length, which need to be kept so they can be plotted as zero counts. See the question "dplyr summarise: Equivalent of ".drop=FALSE" to keep groups with zero length in output".
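
For anyone who still wants a dplyr route, one possible workaround is to count first and then fill in the missing combinations. This is only a sketch and not the earlier answer: it assumes the tidyr package is installed (it is not used elsewhere in this answer) and that the full range of years is 2001 to 2011, as stated in the question.

```r
library(dplyr)
library(tidyr)

# same simulated data as in the answer
set.seed(272727)
data <- data.frame(year = sample(2001:2011, 100, replace = TRUE),
                   string = sample(letters[1:5], 100, replace = TRUE))

dataSummary <- data %>%
  # one row per year-string combination that actually occurs
  count(year, string, name = "Freq") %>%
  # add the combinations that never occur, with a count of zero
  complete(year = 2001:2011, string, fill = list(Freq = 0L))
```

The resulting dataSummary has numeric years and a Freq column, so it can be passed to the same ggplot() call with colour = string.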

Upvotes: 2
