Reputation: 157

replacing 0 with information of other column[dplyr]

example.df <- data.frame(GLX = sample(300:600, 200, replace = TRUE), GLY = sample(300:600, 200, replace = TRUE),
                         GRX = sample(300:600, 200, replace = TRUE), GRY = sample(300:600, 200, replace = TRUE))
example.df$GLX[1:20] <- 0
example.df$GLY[1:20] <- 0
example.df$GRX[70:100] <- 0
example.df$GRY[70:100] <- 0
example.df[150:170, ] <- 0

I have a data.frame containing eye coordinates (X & Y) of the left (GL) and right (GR) eye.

In the case both GLX and GLY are 0, I would like the 0's to be replaced with GRX and GRY, respectively. I also want this to happen the other way around.

In case all 4 columns are 0, I don't want any further action. I already wrote a for loop, but it is terribly slow. Is there any way of doing this with dplyr? I can't get it to work.

Thanks a lot!

Upvotes: 1

Answers (3)

Gregor Thomas

Reputation: 146174

I'd just do a direct replacement in base:

l_0 = example.df$GLX == 0 & example.df$GLY == 0
r_0 = example.df$GRX == 0 & example.df$GRY == 0

example.df[l_0 & ! r_0, c("GLX", "GLY")] = example.df[l_0 & ! r_0, c("GRX", "GRY")]
example.df[r_0 & ! l_0, c("GRX", "GRY")] = example.df[r_0 & ! l_0, c("GLX", "GLY")]

To my knowledge, dplyr doesn't have a convenient way to replace multiple columns at once for a single condition, which makes it more convenient to do in base. While dplyr usually saves typing and makes things readable compared to base, I find the above quite readable and the dplyr alternative annoyingly long, unreadable, and prone to typos due to the repetition with minor changes.

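library(dplyr)
example.df %>% mutate(
  # flag the rows first: mutate() runs in order, so testing GLX == 0 after
  # GLX has been overwritten would leave GLY unreplaced
  l_0 = GLX == 0 & GLY == 0,
  r_0 = GRX == 0 & GRY == 0,
  GLX = if_else(l_0, GRX, GLX),
  GLY = if_else(l_0, GRY, GLY),
  GRX = if_else(r_0, GLX, GRX),
  GRY = if_else(r_0, GLY, GRY)
) %>% select(-l_0, -r_0)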

"In case all 4 columns are 0, I don't want any further action."

I wrote code to match what you described in the question, but it could be simplified a bit if we ignore the "in case all 4 columns are 0" bit -- if all 4 columns are 0 then replacing the 0s with each other doesn't hurt anything. This would let the conditions be simply l_0 and r_0 instead of l_0 & ! r_0 and r_0 & ! l_0.
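A minimal sketch of that simplified version, in base R only. The four-row data frame df is made up for illustration (it is not the question's random data) and covers each case: left eye missing, right eye missing, both missing, neither missing.

```r
# Hypothetical data: one row per case
df <- data.frame(
  GLX = c(0, 310, 0, 320),
  GLY = c(0, 410, 0, 420),
  GRX = c(500, 0, 0, 330),
  GRY = c(600, 0, 0, 430)
)

# Flag rows where both coordinates of an eye are 0
l_0 <- df$GLX == 0 & df$GLY == 0
r_0 <- df$GRX == 0 & df$GRY == 0

# Copy the other eye's coordinates into the flagged rows;
# where all four values are 0, zeros are simply copied over zeros
df[l_0, c("GLX", "GLY")] <- df[l_0, c("GRX", "GRY")]
df[r_0, c("GRX", "GRY")] <- df[r_0, c("GLX", "GLY")]
```

The third row stays all zeros, which is why dropping the extra `! r_0` and `! l_0` checks should not change the result.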

Upvotes: 2

Frank

Reputation: 66819

Another way:

library(data.table)
setDT(example.df)

lcols = c("GLX", "GLY"); rcols = c("GRX", "GRY")
example.df[.(0,0), on=lcols, (lcols) := .SD, .SDcols=rcols]
example.df[.(0,0), on=rcols, (rcols) := .SD, .SDcols=lcols]

This is using a join "on" each pair of columns to find rows where the replacement should be made.
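To see which rows such a join picks out, here is a small sketch, assuming data.table is installed. The table dt and its values are made up for illustration. As far as I know, which = TRUE makes the join return the matching row numbers rather than the rows themselves.

```r
library(data.table)

# Hypothetical data, not the question's random example
dt <- data.table(
  GLX = c(0, 310, 0, 320),
  GLY = c(0, 410, 0, 420),
  GRX = c(500, 0, 0, 330),
  GRY = c(600, 0, 0, 430)
)
lcols <- c("GLX", "GLY")
rcols <- c("GRX", "GRY")

# Row numbers of dt whose (GLX, GLY) pair equals (0, 0)
left_missing <- dt[.(0, 0), on = lcols, which = TRUE]

# Same join, now assigning the right-eye values into the left-eye columns
dt[.(0, 0), on = lcols, (lcols) := .SD, .SDcols = rcols]
```

Inspecting left_missing before running the update is a cheap way to check that the join selects the intended rows.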

As Gregor suggested, I'm ignoring the redundant condition "In case all 4 columns are 0, I don't want any further action."

Upvotes: 1

Richard J. Acton

Reputation: 915

You can use the form below, adding additional if_elses to the mutate for the other columns:

example.df %>% mutate(GLX = if_else(GLX==0 & GLY==0,GRX,GLX))

if_else evaluates the expression in the first position, then returns the value in the second position where it is true and the value in the last position where it is false.
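A short sketch of that behaviour, assuming dplyr is installed. The single-row data frame one_row and the helper column l_0 are made up for illustration. One thing to watch when adding more if_else calls to the same mutate: mutate evaluates its arguments in order, so a condition that mentions a column changed earlier in the call sees the new value.

```r
library(dplyr)

# if_else(condition, value_if_true, value_if_false) works element by element
picked <- if_else(c(TRUE, FALSE, TRUE), c(1, 2, 3), c(10, 20, 30))

# Hypothetical single row: left eye missing, right eye present
one_row <- data.frame(GLX = 0, GLY = 0, GRX = 500, GRY = 600)

# Repeating the condition: by the second line GLX is already 500,
# so the test is FALSE and GLY keeps its 0
naive <- one_row %>%
  mutate(
    GLX = if_else(GLX == 0 & GLY == 0, GRX, GLX),
    GLY = if_else(GLX == 0 & GLY == 0, GRY, GLY)
  )

# Computing the flag once, before any column is overwritten
fixed <- one_row %>%
  mutate(
    l_0 = GLX == 0 & GLY == 0,
    GLX = if_else(l_0, GRX, GLX),
    GLY = if_else(l_0, GRY, GLY),
    l_0 = NULL  # drop the helper column again
  )
```

Comparing naive$GLY with fixed$GLY shows the difference between the two forms.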

Upvotes: 1
