Reputation: 644
I have some data that I need to graph in R. There are two columns: the first is a year ranging from 2001 to 2011, and the second is a string, which can be anything. I need to make a multi-line graph (I was trying to use ggplot) with the year on the x-axis and the number of occurrences of each string on the y-axis.
I don't really have much of an idea where to start. This is what I had but I'm not sure if this is correct.
year <- data$year
# Idk how to get occurrences per year
# year_2001 <- data$string[data$year == 2001]
# would this work?
# ggplot + geom_line()
I know most of that is commented out but that's because I'm new to R. Any help or guidance is greatly appreciated. Thanks!
Upvotes: 0
Views: 779
Reputation: 1717
Here is one way to get it done.
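library(ggplot2)

# reproducible example data: 100 random (year, string) pairs
set.seed(272727)
data <- data.frame(year = sample(2001:2011, 100, replace = TRUE),
                   string = sample(letters[1:5], 100, replace = TRUE))

# this is what will be plotted: occurrences of each string per year
table(data$string, data$year)

# long format with one row per year-string pair, including zero counts
dataSummary <- as.data.frame(xtabs(~ year + string, data))

# one line per string; group = string is needed because year is a factor here
ggplot(dataSummary, aes(x = year, y = Freq, group = string, colour = string)) +
  geom_line()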
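One thing to be aware of: as.data.frame() on an xtabs result returns year as a factor, so the x-axis in that plot is discrete. If you would rather have a continuous year axis, a minimal sketch is to convert the factor back to integers before plotting. It assumes the same simulated data, and the names dataSummaryNum and p are only for illustration.

```r
library(ggplot2)

# same simulated data as in the answer
set.seed(272727)
data <- data.frame(year = sample(2001:2011, 100, replace = TRUE),
                   string = sample(letters[1:5], 100, replace = TRUE))

dataSummaryNum <- as.data.frame(xtabs(~ year + string, data))

# go through as.character(): as.integer() on a factor returns the level codes
dataSummaryNum$year <- as.integer(as.character(dataSummaryNum$year))

# with a numeric x, ggplot groups by the discrete colour variable on its own
p <- ggplot(dataSummaryNum, aes(x = year, y = Freq, colour = string)) +
  geom_line() +
  scale_x_continuous(breaks = 2001:2011)
print(p)
```

Whether this is preferable is a matter of taste. A numeric axis spaces the years by their actual distance, which matters if a year is missing from the data.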
Note: my previous answer used dplyr, but it had an issue with year-string combinations that are zero length. See "dplyr summarise: Equivalent of ".drop=FALSE" to keep groups with zero length in output".
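For anyone who still prefers the dplyr route, here is a rough sketch of one way to keep the empty year-string combinations. It is not the code from the earlier version of this answer. It assumes the tidyr package is installed, and the name dataSummaryDplyr is made up for illustration. count() drops combinations that never occur, and tidyr::complete() adds them back with a count of zero.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# same simulated data as in the answer
set.seed(272727)
data <- data.frame(year = sample(2001:2011, 100, replace = TRUE),
                   string = sample(letters[1:5], 100, replace = TRUE))

dataSummaryDplyr <- data %>%
  count(year, string, name = "Freq") %>%                      # observed pairs only
  complete(year = 2001:2011, string, fill = list(Freq = 0L))  # add missing pairs as 0

p2 <- ggplot(dataSummaryDplyr, aes(x = year, y = Freq, colour = string)) +
  geom_line()
print(p2)
```

Without the complete() step, a string that is absent in some year has no row for that year, so its line is drawn straight across the gap instead of dropping to zero.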
Upvotes: 2