Reputation: 851
Below is my dataframe. It has row names and column names.
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
row1 0 0 0 1 0 0 1 0 0  0  0  0  0  0  0
row2 0 0 0 1 1 1 1 1 1  1  1  1  1  1  0
I would like to derive a column test based on the number of consecutive zeros at the end of each row (counting backwards from the last column). Below is an example. For the first row, there are 8 consecutive zeros at the end, so the value in the test column should be 8. For the second row, the result should be 1, as there is only one trailing zero. (I want to start from column 15 and go back until the run of zeros ends.)
     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 test
row1 0 0 0 1 0 0 1 0 0  0  0  0  0  0  0    8
row2 0 0 0 1 1 1 1 1 1  1  1  1  1  1  0    1
What's the best way to achieve this?
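For reproducibility, the example data could be rebuilt roughly as follows. This is only a sketch: it assumes the row labels are true row names and the column names are the strings "1" to "15", which is one reading of the printout above.

```r
# Rebuild the two example rows as a matrix with row names "row1"/"row2".
m <- rbind(
  row1 = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  row2 = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)
)
colnames(m) <- 1:15

# check.names = FALSE keeps the numeric column names instead of "X1", "X2", ...
df <- data.frame(m, check.names = FALSE)
```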
Upvotes: 0
Views: 2519
Reputation: 14360
You could simply find the index of the first value that does not equal 0 (starting from the last column) and then subtract one:
df$test2 <- apply(df[,ncol(df):1]==0, 1, which.min) - 1
df
#   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 test2
# 1 0 0 0 1 0 0 1 0 0  0  0  0  0  0  0     8
# 2 0 0 0 1 1 1 1 1 1  1  1  1  1  1  0     1
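One edge case may be worth checking: for a row that consists entirely of zeros, which.min finds no FALSE value and returns 1, so the result would be 0 rather than the number of columns. A possible guard is to append a FALSE sentinel to each reversed row. The sketch below uses a hypothetical all-zero row3 that is not part of the question's data.

```r
# Question's two rows plus a hypothetical all-zero row (row3) for the edge case.
m <- rbind(
  row1 = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  row2 = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0),
  row3 = rep(0, 15)
)
colnames(m) <- 1:15
df <- data.frame(m, check.names = FALSE)

# Reverse the columns so that position 1 is the last column.
is_zero_rev <- df[, ncol(df):1] == 0

# The trailing FALSE guarantees which.min always finds a "non-zero" position,
# so an all-zero row yields ncol(df) instead of 0.
trailing <- apply(is_zero_rev, 1, function(x) which.min(c(x, FALSE)) - 1)
trailing
```

With this data the counts for row1, row2 and row3 should be 8, 1 and 15.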
Another answer:
Since I was curious about a way to do this without apply-ing over rows, I came up with an (admittedly complicated) Reduce solution. It is not a solution I recommend, but I was interested to see whether there was a way to do it:
iniCol <- setNames(df[, ncol(df)] == 0, as.numeric(df[, ncol(df)] == 0))
df$test2 <- Reduce(function(ini, add) {
    # flag is 1 while the run of zeros is unbroken, 0 afterwards
    flag <- pmin(as.numeric(names(ini)), add == 0)
    temp <- ifelse(flag == 0, ini, rowSums(cbind(ini, add == 0)))
    setNames(temp, flag)
  },
  df[, (ncol(df) - 1):1],
  init = iniCol)
The idea behind this is to use the names attribute as a flag that tracks, for each row, whether a non-zero value has been seen yet while walking back from the last column. Once one has been seen we stop counting; otherwise we continue counting.
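The same flag-tracking idea can also be written as an explicit loop over columns, which may be easier to follow than the Reduce version. This is a minimal sketch rather than the answer's code: the names running and count are illustrative, and the data frame is rebuilt from the question's example.

```r
m <- rbind(
  row1 = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0),
  row2 = c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)
)
colnames(m) <- 1:15

# running[i] stays TRUE while row i has shown only zeros so far (from the right).
running <- rep(TRUE, nrow(m))
count <- integer(nrow(m))

for (j in ncol(m):1) {
  # Once a non-zero value is seen, running becomes FALSE and stays FALSE.
  running <- running & (m[, j] == 0)
  # Adding a logical adds 1 for rows still inside their trailing run of zeros.
  count <- count + running
}
count
```

Each step operates on a whole column at once, so the loop runs once per column rather than once per row.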
Upvotes: 1
Reputation: 28339
Solution using rle:
getConsecZeroRle <- function(x) {
  # x is a logical vector that is TRUE where the row value equals 0
  foo <- rle(x)
  # Keep the length of the last run only if it is a run of zeros
  if (tail(foo$values, 1)) tail(foo$lengths, 1) else 0
}
result <- apply(df[, -1] == 0, 1, getConsecZeroRle)
df$test <- as.numeric(result)
Explanation:
Use apply to iterate over the rows of the dataframe, excluding the first column (a), which holds the row names. For each row, rle computes the lengths of the runs of consecutive equal values and tail extracts the last run. If that last run is a run of zeros, its length is the result; if the row ends in a non-zero value, the result is 0.
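To make the rle step concrete, here is what it returns for the first example row. This is an illustrative trace only; the variable names row1 and runs are not from the answer.

```r
row1 <- c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0)

# Run-length encode the "is zero" indicator.
runs <- rle(row1 == 0)
runs$values   # TRUE FALSE TRUE FALSE TRUE
runs$lengths  # 3 1 2 1 8

# The final run is a run of zeros (its value is TRUE), so its length
# is the number of trailing zeros.
tail(runs$values, 1)   # TRUE
tail(runs$lengths, 1)  # 8
```

For a row that ends in a non-zero value, the final run has value FALSE, which is why it matters to look at the last run itself and not at the last run of zeros anywhere in the row.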
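Solution using sum:
getConsecZeroSum <- function(x) {
  # x is a logical vector that is TRUE where the row value equals 0
  lastNonZero <- tail(which(!x), 1)
  # An all-zero row has no FALSE value, so every column counts
  if (length(lastNonZero) == 0) return(length(x))
  x[seq_len(lastNonZero)] <- FALSE
  sum(x)
}
df$test <- apply(df[, -1] == 0, 1, getConsecZeroSum)
Explanation:
Find the position of the last FALSE value (the last non-zero entry) in each row and set everything up to and including it to FALSE (x[seq_len(lastNonZero)] <- FALSE), then use sum to count the zeros that remain at the end. A row with no non-zero entries is handled separately, because there is no position to mask.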
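The masking step can be traced on the second example row. The sketch below is illustrative (the names x, last_nonzero, masked_sum and alt are assumptions), and the last line cross-checks the result with a different formulation, sum(cumprod(rev(...))), which is not part of the answer.

```r
row2 <- c(0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0)
x <- row2 == 0

# Position of the last non-zero entry.
last_nonzero <- tail(which(!x), 1)  # 14

# Mask everything up to and including that position, then count what is left.
x[seq_len(last_nonzero)] <- FALSE
masked_sum <- sum(x)                # 1

# Cross-check: reverse the indicator and multiply cumulatively, so the
# product drops to 0 at the first non-zero value seen from the right.
alt <- sum(cumprod(rev(row2 == 0))) # 1
```

The cumprod variant also returns the full length for an all-zero vector, since the cumulative product never drops to 0.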
Result:
#      a 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 test
# 1 row1 0 0 0 1 0 0 1 0 0  0  0  0  0  0  0    8
# 2 row2 0 0 0 1 1 1 1 1 1  1  1  1  1  1  0    1
Upvotes: 4