Francis

Reputation: 6726

Segmenting data frame by row based on specific pattern

Sorry for the imprecise title. Let me elaborate:

I have a data frame like this:

   state
1      v
2      v
3      v
4      v
5      x
6      x
7      x
8      v
9      v
10     x
11     x
12     v
13     v
14     x

I want to segment it into three parts:

First part:

   state
1      v
2      v
3      v
4      v
5      x
6      x
7      x

Second part:

8      v
9      v
10     x
11     x

Third part:

12     v
13     v
14     x

That is, each part contains both states ("v" and "x"), however many rows of each, and a part with the pattern "v,v,x,x,v" (an "x" followed by a "v") should not occur.

Upvotes: 2

Views: 162

Answers (2)

rnso

Reputation: 24535

Try a loop that increments a counter each time a "v" follows an "x", then split on that counter:

> n=0
> ddf$new = n
> for(i in 2:nrow(ddf)){
+ if(ddf$state[i] =='v' &&  ddf$state[i-1] =='x') {n=n+1}
+ ddf$new[i] = n
+ }
> split(ddf, ddf$new)
$`0`
   sno state new
1:   1     v   0
2:   2     v   0
3:   3     v   0
4:   4     v   0
5:   5     x   0
6:   6     x   0
7:   7     x   0

$`1`
   sno state new
1:   8     v   1
2:   9     v   1
3:  10     x   1
4:  11     x   1

$`2`
   sno state new
1:  12     v   2
2:  13     v   2
3:  14     x   2
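
For pasting into a script rather than the console, the same loop can be sketched without the prompts. This assumes ddf is a plain data.frame with the sno and state columns shown in the output above (the printed row labels suggest the original may have been a data.table); the seq_len guard for a single-row input is an addition, not part of the original transcript.

```r
# Rebuild the example data (assumed layout: sno = row number, state = "v"/"x").
ddf <- data.frame(
  sno = 1:14,
  state = c("v", "v", "v", "v", "x", "x", "x",
            "v", "v", "x", "x", "v", "v", "x"),
  stringsAsFactors = FALSE
)

n <- 0
ddf$new <- n
# seq_len(nrow(ddf))[-1] is empty for a one-row frame, so the loop is skipped.
for (i in seq_len(nrow(ddf))[-1]) {
  # A "v" directly after an "x" marks the start of the next part.
  if (ddf$state[i] == "v" && ddf$state[i - 1] == "x") n <- n + 1
  ddf$new[i] <- n
}

parts <- split(ddf, ddf$new)
```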

Upvotes: 0

Sven Hohenstein

Reputation: 81683

You can create a group identifier for each part with cumsum and pass it to split to divide the data frame.

split(dat, c(0, cumsum(with(dat, state[-1] == "v" & head(state, -1) == "x"))))

where dat is the name of your data frame.

The result is a list of three data frames.

$`0`
  state
1     v
2     v
3     v
4     v
5     x
6     x
7     x

$`1`
   state
8      v
9      v
10     x
11     x

$`2`
   state
12     v
13     v
14     x
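
To see why the one-liner works, it can be unpacked into steps. This is only a sketch: dat is rebuilt here from the question's data, and the names is_start and group are illustrative, not part of the original answer.

```r
# Rebuild the example data from the question (assumed to be a plain data.frame).
dat <- data.frame(
  state = c("v", "v", "v", "v", "x", "x", "x",
            "v", "v", "x", "x", "v", "v", "x"),
  stringsAsFactors = FALSE
)

# Compare each row (2..n) with the row before it (1..n-1):
# TRUE where a "v" directly follows an "x", i.e. where a new part begins.
is_start <- dat$state[-1] == "v" & head(dat$state, -1) == "x"

# The running count of starts is the group id; the leading 0 covers row 1,
# which has no predecessor to compare against.
group <- c(0, cumsum(is_start))

parts <- split(dat, group)
```

Because the comparison is vectorized, no explicit loop is needed, and the original row names (1 to 14) are kept in each piece.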

Upvotes: 3
