JessicaJones

Reputation: 13

Updating column based on previous row values

I have this dataframe df1.

    User|Date|Index|
    a   |1   |1    |
    a   |1   |2    |
    a   |1   |3    |
    a   |1   |0    |
    a   |1   |5    |
    a   |1   |6    |
    a   |2   |0    |
    b   |4   |1    |
    b   |4   |2    |
    b   |4   |3    |

I want to update the Index column, in the following way:

  1. Group the data by User, Date;
  2. Assume the rows are correctly ordered;
  3. Go through the Index column; whenever a 0 is found, set it to 1 and renumber the following rows, incrementing by 1 from the previous row, until another 0 is found.

I've narrowed it down to this, but I'm not sure how to complete the replace part so that it does what I want.

    df1 %>%
    group_by(User, Date) %>%
    mutate(Index = replace(Index,)

Can anybody help me?


EDIT: The dataframe above is a simplification. This is the code to reproduce the actual data.

    df1 <- structure(list(User = c(2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3),
    Date = c(16864, 16864, 16864, 16864, 16864, 16879, 16879, 16879, 16879, 16879, 16879, 16879, 16879, 16879),
    Index = c(16, 17, 0, 19, 20, 1, 2, 3, 0, 5, 0, 0, 8, 9)),
    class = "data.frame", .Names = c("User", "Date", "Index"), row.names = c(NA, -14L))

This is the current look:

    User|Date    |Index|
    2   |16864   |16   |
    2   |16864   |17   |
    2   |16864   |0    |
    2   |16864   |19   |
    2   |16864   |20   |
    3   |16879   |1    |
    3   |16879   |2    |
    3   |16879   |3    |
    3   |16879   |0    |
    3   |16879   |5    |
    3   |16879   |0    |
    3   |16879   |0    |
    3   |16879   |8    |
    3   |16879   |9    |

The desired output is:

    User|Date    |Index|
    2   |16864   |16   |
    2   |16864   |17   |
    2   |16864   |1    |
    2   |16864   |2    |
    2   |16864   |3    |
    3   |16879   |1    |
    3   |16879   |2    |
    3   |16879   |3    |
    3   |16879   |1    |
    3   |16879   |2    |
    3   |16879   |1    |
    3   |16879   |1    |
    3   |16879   |2    |
    3   |16879   |3    |

Upvotes: 1

Views: 893

Answers (1)

David Arenburg

Reputation: 92300

There is probably a smarter way to achieve this, but here's my attempt with a custom function:

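library(dplyr)  # provides %>%, group_by(), mutate() and last()

myfun <- function(x) {
  indx <- which(x == 0L)             # positions of the zeros in this group
  if (length(indx) == 0L) return(x)  # no zeros: leave the group untouched
  # head() keeps the values before the first zero (none if the group starts with 0)
  c(head(x, indx[1L] - 1L), sequence(c(diff(indx), length(x) - last(indx) + 1L)))
}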

df1 %>%
  group_by(User, Date) %>%
  mutate(Index = myfun(Index))

# Source: local data frame [14 x 3]
# Groups: User, Date [2]
#     User  Date Index
#    (dbl) (dbl) (dbl)
# 1      2 16864    16
# 2      2 16864    17
# 3      2 16864     1
# 4      2 16864     2
# 5      2 16864     3
# 6      3 16879     1
# 7      3 16879     2
# 8      3 16879     3
# 9      3 16879     1
# 10     3 16879     2
# 11     3 16879     1
# 12     3 16879     1
# 13     3 16879     2
# 14     3 16879     3
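
For comparison, here is a rough sketch of a fully vectorised alternative in base R. The name `renumber_after_zero` is made up for this example, and it assumes `Index` has no `NA` values. The idea is to record, for each element, the position of the most recent 0 and count from there; values before the first 0 are kept as they are.

```r
# Hypothetical helper, not part of the answer above: every 0 restarts the
# count at 1, and values before the first 0 are left unchanged.
# Assumes x contains no NA values.
renumber_after_zero <- function(x) {
  i <- seq_along(x)
  # Position of the most recent 0 at or before each element (0 if none so far)
  last_zero <- cummax(ifelse(x == 0, i, 0L))
  # Before the first 0 keep x; afterwards count 1, 2, 3, ... from the last 0
  ifelse(last_zero == 0, x, i - last_zero + 1)
}

renumber_after_zero(c(16, 17, 0, 19, 20))
# [1] 16 17  1  2  3
renumber_after_zero(c(1, 2, 3, 0, 5, 0, 0, 8, 9))
# [1] 1 2 3 1 2 1 1 2 3
```

It should drop into the same pipeline in place of `myfun`, i.e. `mutate(Index = renumber_after_zero(Index))` after `group_by(User, Date)`.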

Upvotes: 3
