Fetch the values in a data.frame with "winning streak" 2

Question

I have the following df (my actual df is bigger but this is better to explain)

df
    Date   Team Result
1 20-10-2014  Bulls      W
2 20-11-2014  Bulls      W
3 21-11-2014  Bulls      W
4 22-10-2014  Bulls      L
5 23-10-2014  Bulls      L
6 24-10-2014  Bulls      W
7 25-10-2014  Bulls      W
8 26-10-2014  Bulls      L

I used rle to calculate winning streaks

df$Result <- as.character(df$Result)
result.rle <- rle(df$Result)

And the maximum amount of streaks

sort(w.rle$lengths, decreasing = TRUE)

What I would like to do now however is to filter the relevant matches with a winning streak above 2.

I checked out ?rle and couldn't find what i was looking for. Any thoughts on how I can do this?

My end result should be:

#winning streak > 2
        Date   Team Result
1 20-10-2014  Bulls      W
2 20-11-2014  Bulls      W
3 21-11-2014  Bulls      W

etienne · Accepted Answer

A solution using rle :

df[intersect(which(rep(rle(df$Result)$lengths,rle(df$Result)$lengths)>2),which(rep(rle(df$Result)$values,rle(df$Result)$lengths)=='W')),]
#         Date  Team Result
# 1 20-10-2014 Bulls      W
# 2 20-11-2014 Bulls      W
# 3 21-11-2014 Bulls      W

DATA :

df <- structure(list(Date = c("20-10-2014", "20-11-2014", "21-11-2014", 
"22-10-2014", "23-10-2014", "24-10-2014", "25-10-2014", "26-10-2014"
), Team = c("Bulls", "Bulls", "Bulls", "Bulls", "Bulls", "Bulls", 
"Bulls", "Bulls"), Result = c("W", "W", "W", "L", "L", "W", "W", 
"L")), .Names = c("Date", "Team", "Result"), class = "data.frame", row.names = c(NA, 
-8L))

rle(df$Result) returns a list of two elements : the number of repetitions and the values of the repetitions. We replicate the lengths (we can use rle(df$Result)$lengths) to have a vector of the same length as the number of rows of df. We then select the rows with a repetition > 2.

The intersect takes the intersection of the index of the rows where :

there is a >2 streak
it's a winning streak

In detail :

which(rep(rle(df$Result)$values,rle(df$Result)$lengths)=='W')
# [1] 1 2 3 6 7 # Those are the rows where it's a Win. Only those rows can be included

which(rep(rle(df$Result)$lengths,rle(df$Result)$lengths)>2)
# [1] 1 2 3 # those are the rows where it's a streak of 3 wins or losses

The intersection will give the expected output.

Fetch the values in a data.frame with "winning streak" > 2

Answers (2)

Related Questions

Fetch the values in a data.frame with &quot;winning streak&quot; &gt; 2

Answers (2)

Related Questions

Fetch the values in a data.frame with "winning streak" > 2