user3570187

Reputation: 1773

Plotting unique values collapsed by years in R

I have a data set of the form

x <- c("London", "Newyork", "Miami", "London", "London", "London")
y <- c(2008, 2009, 2008, 2010, 2009, 2008)
df <- data.frame(x, y)
plot(length(unique(df$x)),y)

I want to plot the number of unique values of x for each year in y. I expect a graph showing 2008: 2, 2009: 2, 2010: 1. That is, I need to collapse the data by year and count the unique cities in each. Any suggestions?

Upvotes: 2

Views: 5267

Answers (2)

akrun

Reputation: 887891

n_distinct is a convenient dplyr function for counting unique elements. Here, we group by the 'y' column and take the n_distinct of 'x' within each group. The result can then be plotted with ggplot.

library(dplyr)
library(ggplot2)
df %>%
  group_by(y) %>%
  summarise(n = n_distinct(x)) %>%
  ggplot(., aes(x = y, y = n)) +
  geom_bar(stat = 'identity')
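
To check the numbers before plotting, the summary step can be run on its own. This is a minimal sketch that rebuilds the question's sample data; the name `counts` is only illustrative and does not appear in the answer above.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  x = c("London", "Newyork", "Miami", "London", "London", "London"),
  y = c(2008, 2009, 2008, 2010, 2009, 2008)
)

# Count the distinct cities within each year
counts <- df %>%
  group_by(y) %>%
  summarise(n = n_distinct(x))

print(counts)
```

For this data the summary should have one row per year, with n equal to 2, 2 and 1 for 2008, 2009 and 2010, which matches what the question expects.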

Upvotes: 4

Sven Hohenstein

Reputation: 81743

You can use tapply to count the distinct values per year and barplot for plotting.

barplot(with(df, tapply(x, y, function(v) length(unique(v)))))
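
The same one-liner can be split into two steps so the counts are visible before they are plotted. This is a sketch using the question's sample data; the variable name `counts` and the axis labels are illustrative additions, not part of the answer above.

```r
# Sample data from the question
df <- data.frame(
  x = c("London", "Newyork", "Miami", "London", "London", "London"),
  y = c(2008, 2009, 2008, 2010, 2009, 2008)
)

# For each year in y, count the distinct cities in x
counts <- with(df, tapply(x, y, function(v) length(unique(v))))
print(counts)

# The years become the bar labels because tapply names its result by group
barplot(counts, xlab = "Year", ylab = "Number of distinct cities")
```

Because tapply returns a vector named by year, barplot picks up the labels without any extra arguments.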

Upvotes: 3
