Reputation: 9763
I am trying to figure out how to count the number of rows from when one column says True to when the other column says True. I attempted to use run length encoding but couldnt figure out how to get the alternating values form each column.
set.seed(42)
s<-sample(c(0,1,2,3),500,replace=T)
isOverbought<-s==1
isOverSold<-s==0
head(cbind(isOverbought,isOverSold),20)
res<-rle(isOverSold)
tt<-res[res$values==0] #getting when Oversold is true
> head(cbind(isOverbought,isOverSold))
[1,] FALSE FALSE
[2,] FALSE FALSE
[3,] TRUE FALSE <-starting condition is overbought
[4,] FALSE FALSE
[5,] FALSE FALSE
[6,] FALSE FALSE
[7,] FALSE FALSE
[8,] FALSE TRUE <-is oversold. length from overbought to oversold = 5
[9,] FALSE FALSE
[10,] FALSE FALSE
[11,] TRUE FALSE <- is overbought. length from oversold to overbought = 3
[12,] FALSE FALSE
[13,] FALSE FALSE
[14,] TRUE FALSE
[15,] TRUE FALSE
[16,] FALSE FALSE
[17,] FALSE FALSE
[18,] FALSE TRUE <-is oversold. length from overbought to oversold = 7
[19,] TRUE FALSE <- is overbought. length from oversold to overbought = 1
[20,] FALSE FALSE
GOAL
overboughtTOoversold oversoldTOoverbought
5 3
7 1
Upvotes: 3
Views: 111
Reputation: 2986
Here is a short version:
TRUE
, -1 if oversold is TRUE
and NA
if both first 2 cols are FALSE
.( You are interested only in days where the market state switches)na.locf
to fill the NA
s with the last observation carried forwardnow use the rle
function
mktState <- ifelse(df$overBought == TRUE,1,ifelse(df$overSold == TRUE,-1,NA))
mktState <- na.locf(mktState)
to get 'overbought' runs:
> rle(mktState)$lengths[rle(mktState)$values == 1]
[1] 5 7 3 8 2 10 7 3 1 2 4 2 5 6 3 11 4 1 5 2 4 6 1 1 8
[26] 7 3 1 1 1 1 3 2 3 1 6 1 1 1 3 2 4 2 1 6 8 8 1 5 15
[51] 2 5 4 2 1 1 3 4 7 1 7 11 1 3 4 2 4 1
and this will give you the 'oversold' runs:
> rle(mktState)$lengths[rle(mktState)$values == -1]
[1] 3 1 5 6 2 4 1 4 3 3 3 5 2 4 1 14 2 2 10 3 7 1 13 1 1
[26] 3 3 1 6 5 2 1 8 7 2 3 1 1 3 5 1 1 2 3 1 2 2 3 3 1
[51] 8 9 4 2 1 6 2 1 3 2 4 5 1 3 7 4 2 2
Upvotes: 2
Reputation: 7455
The assumption for this answer is that there is at least one overbought/oversold transition (either direction) and hence at least two rows in the data. This condition can easily be checked by counting the number of overbought and oversold conditions and making sure that both are greater than one.
The key is to remove the consecutive overbought and oversold conditions so that we only have alternating overbought and oversold conditions. One way to do this is:
## detect where we are overbought and oversold
i1 <- which(isOverbought)
i2 <- which(isOverSold)
## concatenate into one vector
i3 <- c(i1,i2)
## sort these and get the indices from the sort
i4 <- order(i3)
## at this point consecutive overbought or oversold conditions
## will be marked by a difference of 1 in i4 while alternating
## conditions will be marked by something other than 1. So
## filter those out to get i6. BTW, consecutive here does not mean
## consecutive rows in the data but consecutive occurrence of
## either overbought or oversold conditions without an intervening
## condition of the other. The assumption for at least one transition
## in the data is needed for this to work.
i5 <- diff(i4)
i6 <- i4[c(1,which(i5 != 1)+1)]
## then recover the alternating rows of overbought and oversold conditions in i7
i7 <- i3[i6]
## take the difference and format the output
## I need to credit @akrun for this part
i8 <- diff(i7)
## need to determine which is first
if (i1[1] < i2[1]) {
overboughtTOoversold <- i8[c(TRUE, FALSE)]
oversoldTOoverbought <- i8[c(FALSE, TRUE)]
} else {
overboughtTOoversold <- i8[c(FALSE, TRUE)]
oversoldTOoverbought <- i8[c(TRUE, FALSE)]
}
d1 <- cbind(overboughtTOoversold, oversoldTOoverbought)
print(head(d1))
## overboughtTOoversold oversoldTOoverbought
##[1,] 5 3
##[2,] 7 1
##[3,] 3 5
##[4,] 8 6
##[5,] 2 2
##[6,] 10 4
The cbind
may generate a warning that the columns are not the same length. To get rid of that, just pad with NA
at the end as appropriate.
A more compact version of the above is:
i3 <- c(which(isOverbought), which(isOverSold))
i4 <- order(i3)
i8 <- diff(i3[i4[c(1,which(diff(i4) != 1)+1)]])
if (which(isOverbought)[1] < which(isOverSold)[1]) {
overboughtTOoversold <- i8[c(TRUE, FALSE)]
oversoldTOoverbought <- i8[c(FALSE, TRUE)]
} else {
overboughtTOoversold <- i8[c(FALSE, TRUE)]
oversoldTOoverbought <- i8[c(TRUE, FALSE)]
}
d1 <- cbind(overboughtTOoversold, oversoldTOoverbought)
Upvotes: 2
Reputation: 73385
This is sufficient to solve your problem.
## `a` to `b`
a2b <- function (a, b) {
x <- which(a) ## position of `TRUE` in `a`
y <- which(b) ## position of `TRUE` in `b`
z <- which(a | b) ## position of all `TRUE`
end <- match(y, z) ## match for end position
start <- c(1L, end[-length(end)] + 1L) ## start position
valid <- end > start ## remove cases with `end = start`
z[end[valid]] - z[start[valid]]
}
## cross `a` and `b`
axb <- function (a, b) {
if (any(a & b))
stop ("Invalid input! `a` and `b` can't have TRUE at the same time!")
x <- a2b(a, b); y <- a2b(b, a)
if (which(a)[1L] < which(b)[1L]) cbind(a2b = x, b2a = c(NA_integer_, y))
else cbind(a2b = c(NA_integer_, x), b2a = y)
}
For your isOverbought
and isOverSold
, we obtain:
result <- axb(isOverbought, isOverSold)
head(result)
# a2b b2a
#[1,] 5 NA
#[2,] 7 3
#[3,] 3 1
#[4,] 8 5
#[5,] 2 6
#[6,] 10 2
Since isOverbought
has the first TRUE
before isOverSold
, the first element of the 2nd column is NA
.
Upvotes: 2
Reputation: 43354
Here's a [somewhat long] tidyverse version:
library(dplyr)
library(tidyr)
# put vectors in a data.frame
data.frame(isOverbought, isOverSold) %>%
# evaluate each row separately
rowwise() %>%
# add column with name of event for any TRUE, else NA
mutate(change_type = ifelse(isOverbought | isOverSold, names(.)[c(isOverbought, isOverSold)], NA)) %>%
# reset grouping
ungroup() %>%
# replace NA values with last non-NA value
fill(change_type) %>%
# add a column of the cumulate number of changes in change_type
mutate(changes = data.table::rleid(change_type)) %>%
# count number of rows in each changes and change_type grouping
count(changes, change_type) %>%
# remove leading NAs
na.omit() %>%
# reset grouping
ungroup() %>%
# edit change into runs of two with integer division
mutate(changes = changes %/% 2) %>%
# spread to wide form
spread(change_type, n) %>%
# get rid of extra column
select(-changes)
## # A tibble: 68 x 2
## isOverbought isOverSold
## * <int> <int>
## 1 5 3
## 2 7 1
## 3 3 5
## 4 8 6
## 5 2 2
## 6 10 4
## 7 7 1
## 8 3 4
## 9 1 3
## 10 2 3
## # ... with 58 more rows
Upvotes: 0