jceg316

Reputation: 489

How to plot a count of N/As over time in ggplot2

I have a dataset with dates in one field and NAs in another. I created it as a subset of a larger dataset because I need to see whether the NAs are concentrated in one time period or spread more evenly across time.

My data looks like this:

User_id |    Date    | app_version
001     | 2016-01-03 | <NA>
002     | 2016-03-03 | <NA>
003     | 2016-02-22 | <NA>
004     | 2016-04-15 | <NA>
...

What I'd like to do is plot a line graph with time on the X axis and number of NAs on the Y axis.

Thanks in advance.

Upvotes: 1

Views: 511

Answers (3)

ITM

Reputation: 73

library(plyr)
library(ggplot2)
# Create a field that breaks the dates down to just year and month
# (use substr(df$Date, 1, 4) to break it down by year instead)
df$yr_mth <- substr(df$Date, 1, 7)
# Summarize the number of NAs per year-month
df1 <- ddply(df, .(yr_mth), summarize,
    num_na = sum(is.na(app_version)))
# Plot yr_mth on x, num_na on y; as.Date() needs a full date, so append a day
ggplot(data = df1, aes(x = as.Date(paste0(yr_mth, "-01")), y = num_na)) +
    geom_point()

Upvotes: 0

Jake Kaupp

Reputation: 8072

Using dplyr and ggplot2: Group your data accordingly, summarize and count the number of NA values, then plot. (In this case, I grouped by Date and added geom_point to show each date.)

library(dplyr)
library(ggplot2)

df %>% 
  group_by(Date) %>% 
  summarize(na_count = sum(is.na(app_version))) %>% 
  ggplot(aes(x = Date, y = na_count)) +
  geom_line() +
  geom_point()

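
If daily counts turn out too sparse to show a trend, one option is to bin the dates by month before counting. The following is a minimal sketch, not a definitive version: it assumes `Date` is stored as text in `YYYY-MM-DD` form, and the sample rows (including users 005 and 006) are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical sample data shaped like the table in the question
df <- data.frame(
  User_id = c("001", "002", "003", "004", "005", "006"),
  Date = c("2016-01-03", "2016-03-03", "2016-02-22",
           "2016-04-15", "2016-03-20", "2016-03-25"),
  app_version = c(NA, NA, NA, NA, NA, "1.2"),
  stringsAsFactors = FALSE
)

monthly <- df %>%
  # Parse the text dates, then snap each one to the first day of its month
  mutate(month = as.Date(format(as.Date(Date), "%Y-%m-01"))) %>%
  group_by(month) %>%
  # Count the missing app_version values in each month
  summarize(na_count = sum(is.na(app_version)))

# Line graph with time on the x axis and the NA count on the y axis
p <- ggplot(monthly, aes(x = month, y = na_count)) +
  geom_line() +
  geom_point()
p
```

Because `month` is a real Date column, ggplot2 spaces the points by elapsed time rather than treating each month as a category.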

Upvotes: 1

Terru_theTerror

Reputation: 5017

Your db

User_id<-c("001","002","003","004")
Date<-c("2016-01-03","2016-03-03","2016-02-22","2016-04-15")
app_version<-c(NA,NA,NA,NA)

db<-data.frame(cbind(User_id,Date,app_version))

Your graph

plot(table(db[is.na(db$app_version),"Date"]),type="l")
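
Note that plotting the table directly treats the dates as labels, so the points are evenly spaced regardless of how far apart the dates are. If a true time axis is wanted, a possible base R variant (a sketch that assumes the dates are text in `YYYY-MM-DD` form; `na_dates` and `counts` are names introduced here for illustration) is:

```r
# Same sample data as above, built without cbind() so column types are kept
db <- data.frame(
  User_id = c("001", "002", "003", "004"),
  Date = c("2016-01-03", "2016-03-03", "2016-02-22", "2016-04-15"),
  app_version = c(NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only the dates of rows where app_version is missing
na_dates <- as.Date(db$Date[is.na(db$app_version)])

# Count the NAs per date; table() sorts the dates chronologically
counts <- table(na_dates)

# Convert the table names back to Date so the x axis is a real time scale
plot(as.Date(names(counts)), as.vector(counts), type = "l",
     xlab = "Date", ylab = "Number of NAs")
```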

Your plot

Upvotes: 0
