Reputation: 489
I have a dataset with dates in one field and NAs in another. I created it as a subset of a larger dataset because I need to see whether the NAs are concentrated in one time period or spread more evenly over time.
My data looks like this:
User_id | Date | app_version
001 | 2016-01-03 | <NA>
002 | 2016-03-03 | <NA>
003 | 2016-02-22 | <NA>
004 | 2016-04-15 | <NA>
...
What I'd like to do is plot a line graph with time on the X axis and number of NAs on the Y axis.
Thanks in advance.
Upvotes: 1
Views: 511
Reputation: 73
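library(plyr)
library(ggplot2)
# Create a field that reduces each date to year and month, e.g. "2016-01".
# Use substr(df$Date, 1, 4) instead if you'd rather break it down by year.
df$yr_mth <- substr(df$Date, 1, 7)
# Count the number of NAs per year-month
df1 <- ddply(df, .(yr_mth), summarize,
             num_na = sum(is.na(app_version)))
# as.Date() needs a full date, so append a day before converting
df1$month_start <- as.Date(paste0(df1$yr_mth, "-01"))
# Plot the month on x and num_na on y
ggplot(data = df1, aes(x = month_start, y = num_na)) +
  geom_line() +
  geom_point()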
Upvotes: 0
Reputation: 8072
Using dplyr and ggplot2: group your data accordingly, summarize to count the number of NA values, then plot. (In this case, I grouped by Date and added geom_point to show each date.)
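library(dplyr)
library(ggplot2)
df %>%
  # geom_line() needs a continuous x axis, so make sure Date is a Date
  mutate(Date = as.Date(Date)) %>%
  group_by(Date) %>%
  summarize(na_count = sum(is.na(app_version))) %>%
  ggplot(aes(x = Date, y = na_count)) +
  geom_line() +
  geom_point()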
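If most dates occur only once, a per-day count will sit at 0 or 1 and the trend is hard to read. A minimal sketch of the same group-and-count idea binned by month follows, using base R only. The sample rows, the month and is_na columns, and the na_by_month name are made up for illustration and are not part of the original data.

```r
# Hypothetical sample data mirroring the layout in the question
df <- data.frame(
  User_id = c("001", "002", "003", "004", "005", "006"),
  Date = as.Date(c("2016-01-03", "2016-03-03", "2016-02-22",
                   "2016-04-15", "2016-03-20", "2016-03-25")),
  app_version = c(NA, NA, NA, NA, NA, "1.2"),
  stringsAsFactors = FALSE
)

# Truncate each date to the first day of its month
df$month <- as.Date(format(df$Date, "%Y-%m-01"))

# Flag the missing values, then sum the flags within each month
df$is_na <- is.na(df$app_version)
na_by_month <- aggregate(is_na ~ month, data = df, FUN = sum)
names(na_by_month)[2] <- "na_count"

# Line graph with time on the x axis and the NA count on the y axis
plot(na_by_month$month, na_by_month$na_count, type = "b",
     xlab = "Month", ylab = "Number of NAs")
```

Months with no rows at all are simply absent from na_by_month, so the line is drawn straight across any such gap.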
Upvotes: 1
Reputation: 5017
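Your data:
User_id <- c("001", "002", "003", "004")
Date <- c("2016-01-03", "2016-03-03", "2016-02-22", "2016-04-15")
app_version <- c(NA, NA, NA, NA)
# Build the data frame directly; cbind() would first coerce everything to a character matrix
db <- data.frame(User_id, Date, app_version)
Your graph:
# Tabulate the dates of the rows where app_version is NA and draw the counts as a line
plot(table(db[is.na(db$app_version), "Date"]), type = "l")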
Upvotes: 0