Reputation: 419
I have a data frame with three columns (A, Month, Year), and I want to extract all columns for the rows that cover a specific period, for example from month 10 of year 92 to month 4 of year 93.
A<-c(15:34)
Month<-c(9,9,10,10,11,12,1,2,2,2,3,3,4,4,5,6,7,8,10,10)
Year<-rep(c(92, 93), times = c(6,14))
mydata<- data.frame(A, Month, Year)
I have tried this, but it did not work:
newdata<-mydata[mydata$Month==10 & mydata$Year== 92 : mydata$Month==4 & mydata$Year== 93 ,]
I do not want to use mydata[3:14, ], as my data frame is very large. With more than 50000 rows, I would have to work out by hand which row the period starts on and which row it ends on, and that is not practical.
Is there a way to do this?
The expected result is the same as mydata[3:14, ].
Upvotes: 0
Views: 589
Reputation: 1445
You were really close with your approach. This would work:
newdata<-mydata[min(which(mydata$Month==10 & mydata$Year== 92)) :
max(which(mydata$Month==4 & mydata$Year== 93)) ,]
The expression mydata$Month==10 & mydata$Year== 92 results in a logical vector, which cannot be used to derive a range with the : operator. A range is built from two single numbers (one lower and one upper bound), and the integers you need can be derived from your logical vector using which.
An additional difficulty is that several rows in the data frame share the same Month and Year values, so each which call returns multiple integers. To reduce these integers to a single value, min and max can be used.
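To see what each step contributes, the intermediate values can be inspected on the example data from the question. This is only an illustrative sketch; the names start_rows and end_rows are introduced here and are not part of the original code.

```r
# Example data from the question
A <- 15:34
Month <- c(9, 9, 10, 10, 11, 12, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8, 10, 10)
Year <- rep(c(92, 93), times = c(6, 14))
mydata <- data.frame(A, Month, Year)

# which() turns each logical vector into the positions of its TRUE values
# (start_rows and end_rows are hypothetical helper names)
start_rows <- which(mydata$Month == 10 & mydata$Year == 92)  # rows 3 and 4
end_rows   <- which(mydata$Month == 4  & mydata$Year == 93)  # rows 13 and 14

# min() and max() reduce each set of positions to one bound for the range
newdata <- mydata[min(start_rows):max(end_rows), ]
```

Rows 19 and 20 also have Month equal to 10, but they belong to year 93, so they are not picked up as start rows.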
Note that this only works when the rows you are subsetting are in consecutive order in your data frame. Is that the case?
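If the rows are not stored in chronological order, one possible alternative is to compare the dates themselves instead of row positions. The sketch below assumes that Month and Year are numeric and combines them into a single month counter; the names ym, from and to are made up for this illustration.

```r
# Example data from the question
A <- 15:34
Month <- c(9, 9, 10, 10, 11, 12, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8, 10, 10)
Year <- rep(c(92, 93), times = c(6, 14))
mydata <- data.frame(A, Month, Year)

# Combine year and month into one number that increases with time
# (ym, from and to are hypothetical names for this sketch)
ym   <- mydata$Year * 12 + mydata$Month
from <- 92 * 12 + 10  # month 10 of year 92
to   <- 93 * 12 + 4   # month 4 of year 93

# Keep every row whose date falls inside the period, wherever it sits
newdata <- mydata[ym >= from & ym <= to, ]
```

On the example data this selects the same rows as mydata[3:14, ], and it does not depend on the order of the rows.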
Upvotes: 1