Reputation: 380
Here is what I am trying to achieve in the whatiswant column:
df1 <- data.frame(value = c(99.99, 99.98, 99.97, 99.96, 99.95, 99.94,
                            99.93, 99.92, 99.91, 99.9, 99.9, 99.9),
                  new_value = c(NA, NA, 99.98, NA, 99.97, NA,
                                NA, NA, NA, NA, NA, NA),
                  whatiswant = c(99.99, 99.96, 99.98, 99.95, 99.97, 99.94,
                                 99.93, 99.92, 99.91, 99.9, 99.9, 99.9))
To explain it in words: whatiswant should take the value of new_value where one is given. Rows that have no new_value should take the next lowest value that is still available.
I think it is some kind of lag operation. Here is the data.frame:
   value new_value whatiswant
1  99.99        NA      99.99
2  99.98        NA      99.96
3  99.97     99.98      99.98
4  99.96        NA      99.95
5  99.95     99.97      99.97
6  99.94        NA      99.94
7  99.93        NA      99.93
8  99.92        NA      99.92
9  99.91        NA      99.91
10 99.90        NA      99.90
11 99.90        NA      99.90
12 99.90        NA      99.90
EDIT: Logic explained in following steps:
Upvotes: 0
Views: 169
Reputation: 15784
Here it is as a function, with each step explained in a comment. Ask if anything is unclear:
t1 <- function(df) {
  df[,'whatiswant'] <- df[,'new_value'] # step 1, use value of new_value
  sapply(1:nrow(df),function(row) { # loop on each row
    x <- df[row,] # take the row, just to use a single var instead later
    ret <- unlist(x['whatiswant']) # initial value
    if(is.na(ret)) { # If empty
      if (x['value'] %in% df$whatiswant) { # test if corresponding value is already present
        ret <- df$value[!df$value %in% df$whatiswant][1] # If yes take the first value not present
      } else {
        ret <- unlist(x['value']) # if not take this value
      }
    }
    if(is.na(ret)) ret <- min(df$value) # No value left, take the min
    df$whatiswant[row] <<- ret # update the df from outside sapply so the next presence test is ok.
  })
  return(df) # return the updated df
}
Output:
> df1[,3] <- NA # Set last column to NA
> res <- t1(df1)
> res
   value new_value whatiswant
1  99.99        NA      99.99
2  99.98        NA      99.96
3  99.97     99.98      99.98
4  99.96        NA      99.95
5  99.95     99.97      99.97
6  99.94        NA      99.94
7  99.93        NA      99.93
8  99.92        NA      99.92
9  99.91        NA      99.91
10 99.90        NA      99.90
11 99.90        NA      99.90
12 99.90        NA      99.90
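If the `<<-` inside sapply looks too hacky, the same steps can probably be written as a plain for loop over a working vector. This is only a sketch, assuming the column names from the question; `fill_whatiswant` is a name I made up for illustration, and I only checked the logic against the sample data above.

```r
# Hypothetical helper (name is illustrative): same steps as t1(),
# but using a for loop and a local vector instead of sapply + <<-.
fill_whatiswant <- function(df) {
  want <- df$new_value                     # step 1: start from new_value
  for (i in seq_len(nrow(df))) {
    if (!is.na(want[i])) next              # row already has a new_value
    if (df$value[i] %in% want) {
      # the row's own value is already used: take the first unused value,
      # or fall back to the minimum when nothing is left
      unused <- df$value[!df$value %in% want]
      want[i] <- if (length(unused) > 0) unused[1] else min(df$value)
    } else {
      want[i] <- df$value[i]               # own value is still free
    }
  }
  df$whatiswant <- want
  df
}

# Sample data from the question, without the expected column
df1 <- data.frame(
  value     = c(99.99, 99.98, 99.97, 99.96, 99.95, 99.94,
                99.93, 99.92, 99.91, 99.9, 99.9, 99.9),
  new_value = c(NA, NA, 99.98, NA, 99.97, NA,
                NA, NA, NA, NA, NA, NA)
)
res <- fill_whatiswant(df1)

# Expected column from the question, used only as a check
expected <- c(99.99, 99.96, 99.98, 99.95, 99.97, 99.94,
              99.93, 99.92, 99.91, 99.9, 99.9, 99.9)
stopifnot(isTRUE(all.equal(res$whatiswant, expected)))
```

The loop still has to run row by row, because each row's result depends on which values earlier rows have already taken.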
Upvotes: 2