Nick Knauer
Nick Knauer

Reputation: 4243

Replacing NA in column with values in adjacent column

I have a dataframe as such:

  A       B        C       D       E       F        G        H
  a     LOW      1.5     0.2      NA    1000     2000       NA
  b     LOW      2.9     0.4    HIGH    6000     1000       NA
  c     LOW        1     1.3     LOW     400     1111      LOW 
  d     LOW        2      10     LOW    1000      400     HIGH

How do I do a conditional statement to replace the NA values.

For column E, I want to take the difference of column C and column D, and if it is less than 0, display "small decrease", if it is above 0, then display "small increase".

And then for for column H, do the same thing except using the difference of column F and column G. Display "small decrease" if below 0, and "small increase" if above 0.

Final output should look like this:

  A       B        C       D                 E       F        G                    H
  a     LOW      1.5     0.2    Small Increase    1000     2000       Small Decrease
  b     LOW      2.9     0.4              HIGH    6000     1000       Small Increase
  c     LOW        1     1.3               LOW     400     1111                  LOW 
  d     LOW        2      10               LOW    1000      400                 HIGH

Upvotes: 0

Views: 228

Answers (2)

akrun
akrun

Reputation: 887088

Here is an option using set from data.table which would be very efficient as it assigns value in place

library(data.table)
setDT(df1)#converts 'data.frame' to 'data.table'
#loop through the index of the concerned columns
for(j in c(5L, 8L)) {
  #get the row index of NA for each column
  i1 <- which(is.na(df1[[j]])) 
  #get the value to be replaced based on the difference
  val <- c("Small Increase", "Small Decrease")[((df1[[j-2]][i1] - df1[[j-1]][i1]) < 0) + 1]
  #set the NA elements to the above val
  set(df1, i = i1, j = j, value = val)
 }

df1
#   A   B   C    D              E    F    G              H
#1: a LOW 1.5  0.2 Small Increase 1000 2000 Small Decrease
#2: b LOW 2.9  0.4           HIGH 6000 1000 Small Increase
#3: c LOW 1.0  1.3            LOW  400 1111            LOW
#4: d LOW 2.0 10.0            LOW 1000  400           HIGH

data

df1 <- structure(list(A = c("a", "b", "c", "d"), B = c("LOW", "LOW", 
"LOW", "LOW"), C = c(1.5, 2.9, 1, 2), D = c(0.2, 0.4, 1.3, 10
), E = c(NA, "HIGH", "LOW", "LOW"), F = c(1000L, 6000L, 400L, 
1000L), G = c(2000L, 1000L, 1111L, 400L), H = c(NA, NA, "LOW", 
"HIGH")), .Names = c("A", "B", "C", "D", "E", "F", "G", "H"), 
class = "data.frame", row.names = c(NA, -4L))

Upvotes: 0

joel.wilson
joel.wilson

Reputation: 8413

perform similar steps for other column too!

df$E <- ifelse(is.na(df$E), ifelse(df$C-df$D <0,"small decrease","small increase"), df$E)

Upvotes: 3

Related Questions