Reputation: 83
I have the following rle object:
Run Length Encoding
lengths: int [1:189] 4 5 3 15 6 4 9 1 9 5 ...
values : logi [1:189] FALSE TRUE FALSE TRUE FALSE TRUE ...
I would like to find the mean of the lengths for which the corresponding value is TRUE (I'm not interested in the lengths where the value is FALSE). So far I have converted the rle object to a data frame and used aggregate:
df <- data.frame(values = NoOfTradesAndLength$values, lengths = NoOfTradesAndLength$lengths)
AveLength <- aggregate(lengths ~ values, data = df, FUN = mean)
Which returns this:
  values  lengths
1  FALSE 7.694737
2   TRUE 5.287234
I can now obtain the mean length where values == TRUE, but is there a nicer way of doing this? Or could I achieve a similar result without using rle at all? Converting from a list to a data frame feels a bit fiddly, and I'm sure there is a clever one-line way of doing this.
I've seen variations of this question come up before, but I wasn't able to come up with anything better from those, so your help is much appreciated.
Upvotes: 2
Views: 441
Reputation: 887571
rle returns a list of 'lengths' and 'values'. We can subset 'lengths' using 'values' as a logical index and take the mean:
with(NoOfTradesAndLength, mean(lengths[values]))
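To make the logical indexing concrete, here is a minimal sketch on a small hand-made vector. The vector x and the names r, true_lengths and mean_true are made up for illustration and are not from the question's data.

```r
# Hypothetical input: a short logical vector standing in for the real data
x <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)
r <- rle(x)

r$lengths   # 2 1 1 2 3
r$values    # TRUE FALSE TRUE FALSE TRUE

# Keep only the lengths whose run value is TRUE, then average them
true_lengths <- r$lengths[r$values]   # 2 1 3
mean_true <- mean(true_lengths)       # 2
```

Because 'values' is already a logical vector of the same length as 'lengths', it can be used directly as the index, with no need for which() or a data frame.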
Using a reproducible example
set.seed(24)
NoOfTradesAndLength <- rle(sample(c(TRUE, FALSE), 25, replace=TRUE))
with(NoOfTradesAndLength, mean(lengths[values]))
#[1] 1.5
Running the OP's aggregate code on the same example gives the same value
AveLength[2,]
#  values lengths
#2   TRUE     1.5
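On the question of avoiding rle altogether: one possible sketch, assuming the original logical vector (called x here, a hypothetical name since the question does not show it) is still available, contains no NA, and has at least one TRUE. The mean TRUE run length is the total number of TRUE values divided by the number of TRUE runs, and a TRUE run starts wherever the vector steps from FALSE to TRUE.

```r
# Hypothetical input: the logical vector that would otherwise be passed to rle()
x <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)

n_true <- sum(x)                        # total TRUE values: 6
# Prepend FALSE so a TRUE run at the very start is counted as a run start;
# diff() == 1 marks each FALSE -> TRUE transition
n_runs <- sum(diff(c(FALSE, x)) == 1)   # number of TRUE runs: 3

mean_true_no_rle <- n_true / n_runs     # 2
```

This is not necessarily nicer than the rle version; it only yields the mean, whereas rle keeps the individual run lengths available for other summaries.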
Upvotes: 5