Ranking repeating group of values in a Column using R

Question

I have an email metadata table whose rows are already sorted, so each occurrence of "From" means that the entries that follow are attributes of another email.

The column has a repeating pattern, as below:

 ==============
  Tag
 ==============
  From
  Recepient
  CC_Recepient
  CC_Recepient
  Subject
  From
  Recepient
  CC_Recepient
  Subject
  From
  Recepient
  Subject
  From
  etc..
 ==============

I need to create a second column that holds a unique identifier for each email's group of entries, as below. The repeated occurrence of "From" is the only way I have to identify the start of the next group of entries.

Tag           Identifier
From          1
Recepient     1
CC_Recepient  1
CC_Recepient  1
Subject       1
From          2
Recepient     2
CC_Recepient  2
Subject       2
From          3
Recepient     3
Subject       3
From          4
etc..

akuiper · Accepted Answer

You can test whether Tag equals "From" and then apply cumsum to the resulting logical vector. Each TRUE counts as 1, so the running total increases by one at every "From":

df$Identifier <- cumsum(df$Tag == "From")
df
#            Tag Identifier
#1          From          1
#2     Recepient          1
#3  CC_Recepient          1
#4  CC_Recepient          1
#5       Subject          1
#6          From          2
#7     Recepient          2
#8  CC_Recepient          2
#9       Subject          2
#10         From          3
#11    Recepient          3
#12      Subject          3
#13         From          4
#14        etc..          4
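
For anyone who wants to run this end to end, here is a minimal self-contained sketch in base R. The sample data frame is an assumption built from the rows shown in the question (without the "etc.." placeholder), and the intermediate name is_start is introduced only for illustration.

```r
# Hypothetical sample data mirroring the Tag column from the question
df <- data.frame(
  Tag = c("From", "Recepient", "CC_Recepient", "CC_Recepient", "Subject",
          "From", "Recepient", "CC_Recepient", "Subject",
          "From", "Recepient", "Subject"),
  stringsAsFactors = FALSE
)

# TRUE on each row that starts a new email, FALSE elsewhere
is_start <- df$Tag == "From"

# cumsum() treats TRUE as 1 and FALSE as 0, so the running total
# goes up by one at every "From" and stays flat in between
df$Identifier <- cumsum(is_start)

# Count the rows that belong to each email
table(df$Identifier)
```

Two edge cases are worth keeping in mind with this approach: if the first rows come before any "From", they get Identifier 0, and if Tag contains an NA, the identifier is NA from that row onward.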
