Milaa

Reputation: 419

Extract specific rows with conditions and all columns from dataframe in R

I have a data frame with three columns (A, Month, Year). I want to extract specific rows, keeping all columns: for example, the rows covering the period from month 10 of year 92 to month 4 of year 93.

A<-c(15:34)
Month<-c(9,9,10,10,11,12,1,2,2,2,3,3,4,4,5,6,7,8,10,10)
Year<-rep(c(92, 93), times = c(6,14))
mydata<- data.frame(A, Month, Year)

I have tried this, but it did not work:

newdata<-mydata[mydata$Month==10 & mydata$Year== 92 : mydata$Month==4 & mydata$Year== 93 ,]

I do not want to use mydata[3:14, ], because my real data frame is very large (more than 50000 rows) and I would have to find the first and last row numbers by hand. That is not practical. Is there a way to do this?

The expected result is rows 3 to 14 of mydata, with all three columns.

Upvotes: 0

Views: 589

Answers (1)

MartijnVanAttekum

Reputation: 1445

You were really close with your approach. This would work:

newdata<-mydata[min(which(mydata$Month==10 & mydata$Year== 92)) :
 max(which(mydata$Month==4 & mydata$Year== 93)) ,]

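mydata$Month==10 & mydata$Year== 92 evaluates to a logical vector with one TRUE/FALSE per row, which cannot serve as an endpoint for :. A range needs two single numbers (one lower and one upper bound), and the row numbers you need can be derived from each logical vector using which. Note also that : binds more tightly than == and &, so your original expression is not parsed as "first condition : second condition" in the first place.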

An additional difficulty is that several rows in the data frame share the same Month and Year, so each which call returns multiple integers. To reduce these to a single value each, use min for the start of the range and max for the end.
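
To make the intermediate steps visible, here is a small sketch on the sample data from the question. The names start_rows, end_rows, first_row and last_row are only for illustration.

```r
# Sample data from the question
A <- 15:34
Month <- c(9, 9, 10, 10, 11, 12, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8, 10, 10)
Year <- rep(c(92, 93), times = c(6, 14))
mydata <- data.frame(A, Month, Year)

# Row numbers where each condition is TRUE
start_rows <- which(mydata$Month == 10 & mydata$Year == 92)  # 3 4
end_rows   <- which(mydata$Month == 4  & mydata$Year == 93)  # 13 14

# Collapse each set of row numbers to a single bound
first_row <- min(start_rows)  # 3
last_row  <- max(end_rows)    # 14

# Build the range from the two bounds and subset
newdata <- mydata[first_row:last_row, ]
```

On this data the result is the same as mydata[3:14, ], but the bounds are found from the conditions instead of by hand.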

Note that this only works when the rows you are subsetting are in consecutive order in your data frame. Is that the case?
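
If the rows might not be in consecutive order, one alternative is to filter on the values themselves rather than on row positions. This is a different technique from the which-based range above, shown only as a sketch. It assumes Month and Year are numeric and that the two-digit years all fall in the same century. The name ym is hypothetical.

```r
# Sample data from the question
A <- 15:34
Month <- c(9, 9, 10, 10, 11, 12, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6, 7, 8, 10, 10)
Year <- rep(c(92, 93), times = c(6, 14))
mydata <- data.frame(A, Month, Year)

# Combine year and month into one comparable number,
# e.g. Year 92 and Month 10 become 9210
ym <- mydata$Year * 100 + mydata$Month

# Keep every row from 10/92 through 4/93, whatever the row order
newdata <- mydata[ym >= 9210 & ym <= 9304, ]
```

Because the comparison is made per row, this also works if the data frame has been shuffled, at the cost of building one extra vector.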

Upvotes: 1
