Reputation: 51

How to count consecutive values across columns

In R, I have a dataframe that looks like this:

        2021    2020    2019    2018    2017    2015    2010    2006    2002    1998    1994    1990
1       6       6       6       6       4       6       6       6       6       6       6       6
2       6       6       6       6       6       6       6       6       6       3       4       4
3       7       6       6       6       4       6       6       6       6       6       6       6
4       6       6       6       6       6       6       6       6       6       6       6       6
5       4       4       7       6       4       6       6       6       6       6       4       6
6       6       6       6       6       6       4       6       6       6       2       6       6
...

What I want to do is count how many consecutive values there are from the first column (including the first column), and put the result in a new dataframe called 'age'. So the desired output from the above example would look like this:

I've been playing around with rle, but have had no luck so far. Any ideas?

EDIT to add reproducible data:

df <- structure(list(LC_2021 = c(6, 6, 6, 6, 6, 6), LC_2020 = c(6, 
6, 6, 6, 4, 6), LC_2019 = c(6, 6, 6, 6, 6, 6), LC_2018 = c(6, 
6, 6, 6, 6, 6), LC_2017 = c(6, 6, 6, 6, 6, 6), LC_2015 = c(6, 
6, 6, 6, 6, 6), LC_2010 = c(6, 6, 6, 6, 6, 6), LC_2006 = c(4, 
4, 4, 6, 4, 4), LC_2002 = c(6, 6, 6, 6, 6, 6), LC_1998 = c(6, 
6, 6, 6, 6, 6), LC_1994 = c(6, 6, 6, 6, 6, 6), LC_1990 = c(6, 
6, 6, 6, 6, 6)), row.names = c(NA, 6L), class = "data.frame")

Upvotes: 1

Answers (3)

G. Grothendieck

Reputation: 269654

Assuming that -Inf is not in DF we can use max.col. We create a temporary data frame that has a column of -Inf's to catch any row that is entirely constant. No apply functions or packages are used.

data.frame(age = max.col(cbind(DF, -Inf) != DF[[1]], "first") - 1)
##   age
## 1   4
## 2   9
## 3   1
## 4  12
## 5   2
## 6   5

Note

Lines <- "2021 2020 2019 2018 2017 2015 2010 2006 2002 1998 1994 1990
1    6    6    6    6    4    6    6    6    6    6    6    6
2    6    6    6    6    6    6    6    6    6    3    4    4
3    7    6    6    6    4    6    6    6    6    6    6    6
4    6    6    6    6    6    6    6    6    6    6    6    6
5    4    4    7    6    4    6    6    6    6    6    4    6
6    6    6    6    6    6    4    6    6    6    2    6    6"

DF <- read.table(text = Lines, check.names = FALSE)

Upvotes: 2

Maël

Reputation: 52004

With cummin:

data.frame(age = apply(df, 1, \(x) sum(cummin(x == x[1]))))

#   age
# 1   4
# 2   9
# 3   1
# 4  12
# 5   2
# 6   5

Upvotes: 5

benson23

Reputation: 19097

You can use apply to iterate through the rows, then extract the lengths from rle.

data.frame(age = apply(df, 1, \(x) rle(x)$lengths[[1]]))

  age
1   4
2   9
3   1
4  12
5   2
6   5

Data

df <- structure(list(X2021 = c(6L, 6L, 7L, 6L, 4L, 6L), X2020 = c(6L, 
6L, 6L, 6L, 4L, 6L), X2019 = c(6L, 6L, 6L, 6L, 7L, 6L), X2018 = c(6L, 
6L, 6L, 6L, 6L, 6L), X2017 = c(4L, 6L, 4L, 6L, 4L, 6L), X2015 = c(6L, 
6L, 6L, 6L, 6L, 4L), X2010 = c(6L, 6L, 6L, 6L, 6L, 6L), X2006 = c(6L, 
6L, 6L, 6L, 6L, 6L), X2002 = c(6L, 6L, 6L, 6L, 6L, 6L), X1998 = c(6L, 
3L, 6L, 6L, 6L, 2L), X1994 = c(6L, 4L, 6L, 6L, 4L, 6L), X1990 = c(6L, 
4L, 6L, 6L, 6L, 6L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

Upvotes: 6

How to count consecutive values across columns

Answers (3)

Note

Data

Related Questions