nba2020
nba2020

Reputation: 698

R row-wise entry population based on previous row

I have a R data frame that looks like this:

Category Date        Column
Cat1     2018-03-23  19
Cat1     2018-03-24  19
Cat1     2018-03-25  19
Cat1     2018-03-26  NULL
Cat1     2018-03-27  NULL
...      ....        ...
Cat2     2018-03-23  25
Cat2     2018-03-24  25
Cat2     2018-03-25  25
Cat2     2018-03-26  NULL
Cat2     2018-03-27  NULL
...      ....        ...

and want to replace NULL values with the last number. So for Cat1 the NULL becomes 19 and for Cat2 rows NULL becomes 25. How may I achieve this?

Upvotes: 0

Views: 42

Answers (2)

Jrm_FRL
Jrm_FRL

Reputation: 1423

Your "NULL" is here obviously a character, and not a special value, so you can convert all "NULL" to NAs and then use the tidyr::fill() as someone suggested ?

df <- data.frame(
  Category = c("Cat1", "Cat1", "Cat1", "Cat1", "Cat2"),
  Date = c("2018-03-23", "2018-03-23", "2018-03-23", "2018-03-23", "2018-03-23"),
  Column = c(19, 19, 19, "NULL", 19),
  stringsAsFactors = F
)

# converts 'NULL' or anything to NA
df[df == "NULL"] <- NA

tidyr::fill(df, Column, .direction = "down")

Upvotes: 1

ThomasIsCoding
ThomasIsCoding

Reputation: 102221

Maybe you can try the base R code below, using ave

df <- within(df, Column <- ave(Column, Category, FUN = function(x) ifelse(x=="NULL",tail(x[x!="NULL"],1),x)))

# > df
# Category       Date Column
# 1      Cat1 2018-03-23     19
# 2      Cat1 2018-03-24     19
# 3      Cat1 2018-03-25     19
# 4      Cat1 2018-03-26     19
# 5      Cat1 2018-03-27     19
# 6      Cat2 2018-03-23     25
# 7      Cat2 2018-03-24     25
# 8      Cat2 2018-03-25     25
# 9      Cat2 2018-03-26     25
# 10     Cat2 2018-03-27     25

DATA

df <- structure(list(Category = c("Cat1", "Cat1", "Cat1", "Cat1", "Cat1", 
"Cat2", "Cat2", "Cat2", "Cat2", "Cat2"), Date = c("2018-03-23", 
"2018-03-24", "2018-03-25", "2018-03-26", "2018-03-27", "2018-03-23", 
"2018-03-24", "2018-03-25", "2018-03-26", "2018-03-27"), Column = c("19", 
"19", "19", "19", "19", "25", "25", "25", "25", "25")), row.names = c(NA, 
-10L), class = "data.frame")

Upvotes: 1

Related Questions