John Dwyer

Reputation: 189

Function to transform a dataframe is not working properly

I have a dataframe that looks like this:

 x <- c(1,2,3)
 y <- c(4,5,5)

 df <- data.frame(x,y)

Now I would like to run my data frame through a function that replaces the 5 values in y with 0, like this:

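getRidOdNAs <- function(df){
  # walk the rows; seq_len() also copes with a zero-row data frame
  for (i in seq_len(nrow(df))){
    if(df$y[i] == 5){
      df$y[i] <- 0
    }
  }
  # return the modified copy; the caller's df is not changed
  return(df)
}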

Now this works. But when I print my data frame df afterwards, I still get the original values:

> df
  x y
1 1 4
2 2 5
3 3 5

Any thoughts on what I should do to get back the changed data frame?

Upvotes: 0

Views: 86

Answers (1)

G. Grothendieck

Reputation: 269556

Note that getRidOdNAs does not modify the input df but rather outputs a new data frame that is a modified version of the input. That output must be assigned to a variable or else it will be lost. Running this code using the function in the question does work:

df.orig <- df # make a copy of df and store it in df.orig

df2 <- getRidOdNAs(df)
df2 # df2 is indeed a modified version of the input, df
##   x y
## 1 1 4
## 2 2 0
## 3 3 0

identical(df.orig, df) # df unchanged
## [1] TRUE
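
If the goal is for df itself to hold the changed values, a minimal sketch is to assign the function's result back to the same name. This assumes overwriting the original data is acceptable; the data and function are repeated from the question so the snippet runs on its own.

```r
# data and function as in the question
x <- c(1, 2, 3)
y <- c(4, 5, 5)
df <- data.frame(x, y)

getRidOdNAs <- function(df) {
  for (i in seq_len(nrow(df))) {
    if (df$y[i] == 5) {
      df$y[i] <- 0
    }
  }
  df
}

# assign the returned copy back to df; without the assignment the result is discarded
df <- getRidOdNAs(df)
df
```

After the assignment, printing df shows the modified y column rather than the original one.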

Note that this also works:

df3 <- transform(df, y = replace(y, y == 5, 0))

identical(df2, df3) # check that df2 and df3 are identical
## [1] TRUE

As does this:

df4 <- df # make a copy so we can avoid overwriting df
df4$y[df4$y == 5] <- 0  # overwrite df4

identical(df4, df2) # df4 is same as df2
## [1] TRUE
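
The vectorized replacement can also be wrapped in a function if that is preferred. The sketch below is only an illustration: replaceValue and its arguments col, from and to are hypothetical names, not part of the question. It uses %in% rather than ==, so rows where the column is NA are left alone instead of causing an error as they would in the if() test of the original loop.

```r
# hypothetical helper: replace every occurrence of `from` in column `col` with `to`
replaceValue <- function(df, col, from, to) {
  hit <- df[[col]] %in% from   # FALSE for NA, so NA rows are skipped
  df[[col]][hit] <- to
  df                           # return the modified copy
}

x <- c(1, 2, 3)
y <- c(4, 5, 5)
df <- data.frame(x, y)

# as before, the result has to be assigned to be kept
df5 <- replaceValue(df, "y", 5, 0)
df5
```

As with getRidOdNAs, the input df is not modified; only df5 contains the replaced values.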

Upvotes: 3
