Krishnang K Dalal

Reputation: 2556

Filter time series data based on a condition in R using dplyr

I have time series data that looks like this.

                         keyword             byMonth        n_views
business tax preparation software           Dec-2016              5
    corporate income tax solution           Nov-2016              3
    corporate income tax solution           Mar-2017              2
          corporate tax provision           Dec-2016              5
          corporate tax provision           Oct-2016              1
                  data collection           Mar-2017             39
                  data collection           May-2017             26
                  data collection           Apr-2017             22
                  data collection           Feb-2017             15
                  data collection           Jan-2017             15
                  data collection           Nov-2016             13
                  data collection           Dec-2016              7
                  data collection           Oct-2016              6

I want to select only those keywords that have data for every month from Oct-2016 to May-2017, using dplyr or any other convenient method. In this case, only the observations for the keyword data collection should be returned. I am having a hard time figuring this out. Thanks a ton in advance.

Upvotes: 0

Views: 460

Answers (1)

www

Reputation: 39154

We can use functions from dplyr and tidyr. The keywords in dt2 are those with complete month coverage, that is, a non-missing n_views for every month that appears in the data.

library(dplyr)
library(tidyr)

dt2 <- dt %>%
  # Spread to wide format: one row per keyword, one column per month
  spread(byMonth, n_views) %>%
  # Keep only the rows that have no NA in any column
  filter(rowSums(!is.na(.)) == ncol(.))
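A grouped filter is another way to get the same rows without reshaping. The sketch below is not part of the original answer; it assumes the required window is exactly Oct-2016 through May-2017, and the names required_months and dt_full are made up for illustration. Because the months are listed explicitly, a month that is absent for every keyword still counts as missing.

```r
library(dplyr)

# Same sample data as in the question
dt <- data.frame(
  keyword = c("business tax preparation software",
              rep("corporate income tax solution", 2),
              rep("corporate tax provision", 2),
              rep("data collection", 8)),
  byMonth = c("Dec-2016", "Nov-2016", "Mar-2017", "Dec-2016", "Oct-2016",
              "Mar-2017", "May-2017", "Apr-2017", "Feb-2017", "Jan-2017",
              "Nov-2016", "Dec-2016", "Oct-2016"),
  n_views = c(5, 3, 2, 5, 1, 39, 26, 22, 15, 15, 13, 7, 6),
  stringsAsFactors = FALSE
)

# Assumption: these are the months every keyword must cover
required_months <- c("Oct-2016", "Nov-2016", "Dec-2016", "Jan-2017",
                     "Feb-2017", "Mar-2017", "Apr-2017", "May-2017")

dt_full <- dt %>%
  group_by(keyword) %>%
  # Keep a keyword only if every required month appears among its rows
  filter(all(required_months %in% byMonth)) %>%
  ungroup()
```

The result stays in the original long format, so no conversion back is needed.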

Update: Convert the data frame back to the original format

If the original format is needed, we can use gather to convert it back.

dt3 <- dt2 %>%
  gather(byMonth, n_views, -keyword)
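In tidyr 1.0.0 and later, spread and gather still work but are marked as superseded by pivot_wider and pivot_longer. A rough equivalent of the two steps above, as a sketch rather than the original answer's code, could look like this; drop_na() stands in for the rowSums filter, and dt_wide and dt_long are illustrative names.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
dt <- data.frame(
  keyword = c("business tax preparation software",
              rep("corporate income tax solution", 2),
              rep("corporate tax provision", 2),
              rep("data collection", 8)),
  byMonth = c("Dec-2016", "Nov-2016", "Mar-2017", "Dec-2016", "Oct-2016",
              "Mar-2017", "May-2017", "Apr-2017", "Feb-2017", "Jan-2017",
              "Nov-2016", "Dec-2016", "Oct-2016"),
  n_views = c(5, 3, 2, 5, 1, 39, 26, 22, 15, 15, 13, 7, 6),
  stringsAsFactors = FALSE
)

dt_wide <- dt %>%
  # One row per keyword, one column per month
  pivot_wider(names_from = byMonth, values_from = n_views) %>%
  # Drop keywords that have an NA in any month column
  drop_na()

dt_long <- dt_wide %>%
  # Back to one row per keyword and month
  pivot_longer(-keyword, names_to = "byMonth", values_to = "n_views")
```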

Data Preparation

dt <- read.table(text = "                        keyword             byMonth        n_views
'business tax preparation software'           'Dec-2016'              5
               'corporate income tax solution'           'Nov-2016'              3
               'corporate income tax solution'           'Mar-2017'              2
               'corporate tax provision'           'Dec-2016'              5
               'corporate tax provision'           'Oct-2016'              1
               'data collection'           'Mar-2017'             39
               'data collection'           'May-2017'             26
               'data collection'           'Apr-2017'             22
               'data collection'           'Feb-2017'             15
               'data collection'           'Jan-2017'             15
               'data collection'           'Nov-2016'             13
               'data collection'           'Dec-2016'              7
               'data collection'           'Oct-2016'              6",
               header = TRUE, stringsAsFactors = FALSE)

Upvotes: 2
