kputschko
kputschko

Reputation: 816

Count the sequence of numbers while skipping missing values

I have a series of dates and I want to count each record the sequence of dates, while skipping missing values.

Essentially, I want to see the following result, where a are my dates and b is my index of the date record. You can see that row 5 is my 4th record, and visit 7 is my 5th record.

tibble(a = c(12, 24, 32, NA, 55, NA, 73), b = c(1, 2, 3, NA, 4, NA, 5))

      a     b
  <dbl> <dbl>
1    12     1
2    24     2
3    32     3
4    NA    NA
5    55     4
6    NA    NA
7    73     5

It seems that group_by() %>% mutate(sq = sequence(n())) doesn't work in this case, because I don't know how to filter out the missing values while counting. I need to keep those missing values because my data is pretty large.

Is a separate operation of filtering the data, getting the sequence, and using left_join my best option?

Upvotes: 1

Views: 321

Answers (4)

G. Grothendieck
G. Grothendieck

Reputation: 269644

Cumulatively sum an indicator of non-NA and then add 0*a to effectively NA out any component that was originally NA while adding 0 to the rest (so not changing them).

a <- c(12, 24, 32, NA, 55, NA, 73)
cumsum(!is.na(a)) + 0 * a
## [1]  1  2  3 NA  4 NA  5

Upvotes: 3

akrun
akrun

Reputation: 887118

We could specify the i as non-NA logical vector, and create the 'b' by assigning the sequence of rows

library(data.table)
setDT(dat)[!is.na(a),  b := seq_len(.N)]

-output

dat
#    a  b
#1: 12  1
#2: 24  2
#3: 32  3
#4: NA NA
#5: 55  4
#6: NA NA
#7: 73  5

Upvotes: 1

ThomasIsCoding
ThomasIsCoding

Reputation: 101373

Maybe you can try replace + seq_along like below

within(
  df,
  b <- replace(a, !is.na(a), seq_along(na.omit(a)))
)

Upvotes: 1

M--
M--

Reputation: 28850

library(dplyr)

dat <- tibble(a = c(12, 24, 32, NA, 55, NA, 73))

dat %>% 
  mutate(sq = ifelse(is.na(a), NA, cumsum(!is.na(a))))

#> # A tibble: 7 x 2
#>       a    sq
#>   <dbl> <int>
#> 1    12     1
#> 2    24     2
#> 3    32     3
#> 4    NA    NA
#> 5    55     4
#> 6    NA    NA
#> 7    73     5

Upvotes: 3

Related Questions