Reputation: 19
I want to add a column to a data frame that numbers each run of consecutive values in another column.
For example, in the data frame below, I would like to generate V11 automatically so that it holds one number per run of consecutive numbers in V1. In other words, V11 should number the sentences whose tokens are listed in V2: 1 for every row of the first sentence, 2 for every row of the second, and so on.
   V1 V2      V3 V4   V5  V6 V7 V8    V9 V10 V11
1  1  I       _  PRON PRP _  2  nsubj _  _   1
2  2  saw     _  VERB VBD _  0  ROOT  _  _   1
3  3  a       _  DET  DT  _  4  det   _  _   1
4  4  man     _  NOUN NN  _  2  dobj  _  _   1
5  5  with    _  ADP  IN  _  4  prep  _  _   1
6  6  glasses _  NOUN NNS _  5  pobj  _  _   1
7  7  .       _  .    .   _  2  punct _  _   1
8  1  I       _  PRON PRP _  2  nsubj _  _   2
9  2  saw     _  VERB VBD _  0  ROOT  _  _   2
10 3  a       _  DET  DT  _  4  det   _  _   2
11 4  woman   _  NOUN NN  _  2  dobj  _  _   2
12 5  .       _  .    .   _  2  punct _  _   2
I suspect this is possible with a for-loop, but I don't know how to write one.
Thank you in advance for your answers.
Upvotes: 0
Views: 135
Reputation: 887571
A base R solution (with df1 as the data frame) would be:
cumsum(c(TRUE, diff(df1$V1) < 0))
#[1] 1 1 1 1 1 1 1 2 2 2 2 2
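To see why the one-liner works, here is a minimal sketch that splits it into its intermediate steps. The data frame is a toy stand-in built only from the V1 values in the question; the names steps and is_start are introduced here for illustration and are not part of the original answer.

```r
# Toy stand-in for the question's data: only the token index column V1.
df1 <- data.frame(V1 = c(1:7, 1:5))

# diff() returns the change between neighbouring values. Within a
# sentence it is +1; it is negative where the index falls back to 1.
steps <- diff(df1$V1)

# Prepend TRUE so the first row opens sentence 1, then flag every drop.
is_start <- c(TRUE, steps < 0)

# cumsum() over the logical flags counts how many sentences have
# started so far, which is the sentence number for each row.
df1$V11 <- cumsum(is_start)

print(df1$V11)
```

Assigning the result to df1$V11 is what actually adds the column the question asks for.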
Upvotes: 1
Reputation: 144
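This should work for you, where mydata is the name of your data frame:
mydata$V11 <- 1
j <- 1
# seq_len(...)[-1] is empty for a one-row data frame, unlike 2:nrow(mydata)
for (i in seq_len(nrow(mydata))[-1]) {
  # start a new group whenever V1 does not continue the previous value
  if (mydata$V1[i] != mydata$V1[i - 1] + 1) {
    j <- j + 1
  }
  mydata$V11[i] <- j
}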
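The rule used by the loop (open a new group whenever V1 is not the previous value plus one) can also be written without a loop. The sketch below is an assumption about an equivalent vectorised form, not something taken from this answer. It uses a hypothetical index vector v1 that contains a one-token sentence, which is the case where this rule and a "value dropped" test (diff < 0) give different results.

```r
# Hypothetical token indices: sentences of length 3, 2, 1 and 2.
# The one-token sentence makes V1 equal 1 on two consecutive rows.
v1 <- c(1, 2, 3, 1, 2, 1, 1, 2)

# Same rule as the loop: a new group starts wherever the step is not +1.
id_step_rule <- cumsum(c(TRUE, diff(v1) != 1))

# For comparison: a new group starts only where the value drops.
id_drop_rule <- cumsum(c(TRUE, diff(v1) < 0))

print(id_step_rule)  # 1 1 1 2 2 3 4 4
print(id_drop_rule)  # 1 1 1 2 2 3 3 3
```

On the data in the question both rules give the same result, since every sentence there has more than one token.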
Upvotes: 0