Johannes

Reputation: 69

Find pattern in data.frame and if reappearing delete rows in-between

Hi there I have a dataframe 'dat2':

   CYC POS REP      CTIME  DTIME1  DTIME2 CUCNTS SQP SQP. STIME CTIME_mins    ID     
1    1   1   1   5:00.900  11.827  11.808  55069   0    0  0:00    5.01500  WSTD 
2    1   2   1  50:01.781  68.202  68.192   1199   0    0  0:00   50.02968    S1    
3    1   3   1 100:01.781 136.185 136.135     21   0    0  0:00  100.02968    B1     
4    1   4   1 100:01.781 136.179 136.134     19   0    0  0:00  100.02968 TSG41     
5    1   5   1 100:01.775 136.180 135.340     16   0    0  0:00  100.02958 TSG42     
6    1   6   1 100:00.781 136.133 136.073     23   0    0  0:00  100.01302 TSG43     
7    1   7   1 100:01.781 136.200 136.146     93   0    0  0:00  100.02968 TSG44     
8    1   8   1 100:01.781 136.186 135.358    161   0    0  0:00  100.02968 TSG45     
9    1   9   1  50:01.781  68.217  68.201   1273   0    0  0:00   50.02968    S2    
10   1  10   1 100:01.780 136.178 136.137     15   0    0  0:00  100.02967    B2     
11   1  21   1   0:25.899   0.596   0.593      1   0    0  0:00    0.43165 TSG46     
12   1   1   1   5:00.900  11.846  11.826  57932   0    0  0:00    5.01500  WSTD 
13   1   2   1  50:01.719  68.379  68.347   1091   0    0  0:00   50.02865    S1

This data frame has 13 rows (the length can vary, which should not matter). First, I need to check the first three columns, 'CYC', 'POS' and 'REP', for the pattern 'CYC == 1 && POS == 1 && REP == 1'. If the pattern occurs only once in the table, nothing should happen. If it occurs more than once, all rows before the second appearance should be deleted; in this case, that means rows 1 to 11. I thought I could do something like:

for (i in 1:nrow(dat2)){
     dat2 <- dat2[-i,]
     repeat{ 
       if (dat2[i,"CYC"] == dat2[1,"CYC"] && 
           dat2[i,"POS"] == dat2[1,"POS"] && 
           dat2[i,"REP"] == dat2[1,"REP"]) 

         {break}

     } 
       #dat2trial <- dat2[-c(1:i-1),]
   }

However, I seem to have created a loop that runs forever. And just to emphasize: if it does not find a repetition of the stated pattern, then it should do nothing.

Upvotes: 0

Views: 47

Answers (1)

Rui Barradas

Reputation: 76402

If I understand your question correctly, try the following.

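# row numbers where the pattern matches
inx <- with(dat2, which(CYC == 1 & POS == 1 & REP == 1))
# keep only the last match
inx <- inx[length(inx)]
# drop the rows before it; length() guards the case of no match at all
dat3 <- if(length(inx) > 0 && inx > 1) dat2[-seq_len(inx - 1), ] else dat2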
dat3
#   CYC POS REP     CTIME DTIME1 DTIME2 CUCNTS SQP SQP. STIME CTIME_mins   ID
#12   1   1   1  5:00.900 11.846 11.826  57932   0    0  0:00    5.01500 WSTD
#13   1   2   1 50:01.719 68.379 68.347   1091   0    0  0:00   50.02865   S1
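
The question asks for two things the code above does not distinguish: cut at the second appearance specifically, and leave the data untouched when the pattern occurs fewer than two times. A minimal sketch of that rule follows. The helper name `drop_before_second`, its arguments, and the small `dat2_demo` stand-in are hypothetical and not part of the original post.

```r
# Hypothetical helper: drop every row before the second occurrence of a
# CYC/POS/REP pattern; return the data unchanged if there is no second one.
drop_before_second <- function(df, cyc = 1, pos = 1, rp = 1) {
  # Row positions where all three columns match the pattern
  hits <- which(df$CYC == cyc & df$POS == pos & df$REP == rp)
  # Fewer than two matches: nothing to delete
  if (length(hits) < 2) return(df)
  # Keep everything from the second match onward
  df[hits[2]:nrow(df), , drop = FALSE]
}

# Small stand-in for dat2 (assumed data, only the key columns plus ID)
dat2_demo <- data.frame(
  CYC = 1,
  POS = c(1, 2, 3, 1, 2),
  REP = 1,
  ID  = c("WSTD", "S1", "B1", "WSTD", "S1")
)

res <- drop_before_second(dat2_demo)
res
```

On the real `dat2`, the call would be `dat3 <- drop_before_second(dat2)`. Because `which()` is vectorised, no explicit loop is needed; the `repeat` in the question never ends because nothing inside it changes `i` or `dat2`, so a condition that is false once stays false.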

Upvotes: 2
