Reputation: 189
I have a dataframe that looks like this:
x <- c(1,2,3)
y <- c(4,5,5)
df <- data.frame(x,y)
Now I would like to pass my dataframe through a function that replaces the 5 values in y with 0.
Something like this:
getRidOdNAs <- function(df) {
  for (i in 1:nrow(df)) {
    if (df$y[i] == 5) {
      df$y[i] <- 0
    }
  }
  return(df)
}
The function runs without error, but when I print my dataframe df afterwards I still get the original values:
> df
  x y
1 1 4
2 2 5
3 3 5
Any thoughts on what I should do to get back the changed dataframe?
Upvotes: 0
Views: 86
Reputation: 269556
Note that getRidOdNAs does not modify the input df but rather outputs a new data frame that is a modified version of the input. That output must be assigned to a variable or else it will be lost. Running this code using the function in the question does work:
df.orig <- df # make a copy of df and store it in df.orig
df2 <- getRidOdNAs(df)
df2 # df2 is indeed a modified version of the input, df
##   x y
## 1 1 4
## 2 2 0
## 3 3 0
identical(df.orig, df) # df unchanged
## [1] TRUE
Note that this also works:
df3 <- transform(df, y = replace(y, y == 5, 0))
identical(df2, df3) # check that df2 and df3 are identical
## [1] TRUE
As does this:
df4 <- df # make a copy so we can avoid overwriting df
df4$y[df4$y == 5] <- 0 # overwrite df4
identical(df4, df2) # df4 is same as df2
## [1] TRUE
Upvotes: 3