select the data by month and years in R

Question

I have a data frame ordered by month and year. I want to select only a whole number of years, i.e. if the data start in July 2002 and end in September 2010, select only the data from July 2002 to June 2010. If the data start in September 1992 and end in March 2000, select only the data from September 1992 to August 1999. This should work regardless of any missing months in between.

The data can be downloaded from the following link: enter link description here

The code

mydata <- read.csv("E:/mydata.csv", stringsAsFactors=TRUE)

This is the manual selection:

selected.data <- mydata[1:73,]   # July 2002 to June 2010

How can I achieve that with code?

Elia · Accepted Answer

Here is a base R solution that reproduces your manual subsetting:

mydata <- read.csv("D:/mydata.csv", stringsAsFactors=FALSE)
lookup <-
  c(
    January = 1,
    February = 2,
    March = 3,
    April = 4,
    May = 5,
    June = 6,
    July = 7,
    August = 8,
    September = 9,
    October = 10,
    November = 11,
    December = 12
  )
# Replace each month name with its number via the lookup vector
mydata$Month <- unname(lookup[as.character(mydata$Month)])

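first.month <- mydata$Month[1]
last.year <- max(mydata$Year)
# Month preceding the first month (wraps to December when the data start in January)
prev.month <- if (first.month == 1) 12 else first.month - 1
# Keep all rows up to that month in the last year
mydata[1:which(mydata$Month == prev.month & mydata$Year == last.year), ]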

Basically, I convert the month names to numbers and find the month preceding the first month that appears in the data frame, in the last year of the data frame. All rows up to that one are kept.
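The approach above relies on the preceding month being present in the last year of the data, which is not the case for the question's second example (September 1992 to March 2000). A possible generalisation, given only as a sketch, is to turn each row into a running month index and keep the rows that fall inside the largest whole number of years. The function name `select_whole_years` and the example data frame are hypothetical. The column names `Month` (English month names) and `Year` are assumed from the code above.

```r
# Hypothetical example data: September 1992 to March 2000, with a few gaps.
dates <- seq(as.Date("1992-09-01"), as.Date("2000-03-01"), by = "month")
dates <- dates[-c(5, 20, 41)]  # drop some months to mimic missing data
example <- data.frame(
  Month = month.name[as.integer(format(dates, "%m"))],
  Year  = as.integer(format(dates, "%Y")),
  stringsAsFactors = FALSE
)

# Assumes columns `Month` (English month name) and `Year` (integer).
select_whole_years <- function(df) {
  # Running month index: increases by 1 for every calendar month.
  idx <- df$Year * 12L + match(as.character(df$Month), month.name)
  start <- min(idx)
  # Number of complete years covered between the first and last month.
  n.years <- (max(idx) - start + 1L) %/% 12L
  # Keep only rows that fall within those complete years.
  df[idx < start + 12L * n.years, , drop = FALSE]
}

selected <- select_whole_years(example)
tail(selected, 1)  # last retained row
```

Because the cut-off is computed from the calendar rather than from row positions, it does not depend on any particular month being present in the data. `month.name` is a built-in base R constant, so no manual lookup vector is needed here.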
