Reputation: 3
Hi, I am trying to rename my columns and then convert every negative value in them to NA. I got the second part right, but for some reason I am unable to change the column names. This is my code; note that mscr is the CSV with the column names I wish to change, and df2 is just the renamed copy of it. Thank you for your time and help.
df2 <- mscr %>%
rename(
caseid = R0000100,
children2000 = R6389600
)
df2 <- mscr
df2[df2 < 0] <- NA
Upvotes: 0
Views: 177
Reputation: 160447
I might be misunderstanding, but I think what you're doing is renaming the columns (successfully), and then over-writing the newly-renamed data with the original. That is,
df2 <- mscr %>% rename(...)
is correct, and the names should then be changed. The moment you then do
df2 <- mscr
before you then replace the negative values, you revert any changes you made.
rename (and just about every "verb" function in dplyr, and many in R) operates solely in a functional manner, which means the input data is left completely unchanged. If it were changed in place, that would be a side effect, which is antithetical to the normal, idiomatic way of doing things in R.
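To make that concrete, here is a minimal sketch using a made-up stand-in for mscr (the values are invented for illustration, and the names renamed and reverted are hypothetical). It shows that rename returns a new data frame and leaves its input alone, so assigning the original to the result afterwards throws the new names away.

```r
library(dplyr)

# Toy stand-in for mscr; the values are made up for illustration.
mscr <- data.frame(R0000100 = c(1L, 2L, 3L), R6389600 = c(2, -4, 0))

# rename() returns a modified copy; mscr itself is not touched.
renamed <- mscr %>%
  rename(
    caseid = R0000100,
    children2000 = R6389600
  )

names(renamed)  # "caseid" "children2000"
names(mscr)     # "R0000100" "R6389600"

# Assigning the original again discards the rename.
reverted <- mscr
names(reverted) # "R0000100" "R6389600"
```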
Try this:
library(dplyr)
df2 <- mscr %>%
rename(
caseid = R0000100,
children2000 = R6389600
) %>%
mutate(across(everything(), ~ if_else(. < 0, .[NA], .)))
One would normally want to use just NA, but since NA is technically of class logical, and I'm inferring that your data is numeric or integer, we need to get the right class. One option is to do this step individually for numeric and then integer columns, for which we would use NA_real_ and NA_integer_, respectively. However, .[NA] in this case will give an NA classed the same as the original column data.
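A quick way to check the classes involved, as a sketch on two made-up vectors (x_int and x_dbl are hypothetical names, not columns from your data):

```r
# Two small example vectors, one integer and one double.
x_int <- c(1L, -2L, 3L)
x_dbl <- c(1.5, -2.5, 3)

class(NA)           # "logical"
class(NA_integer_)  # "integer"
class(NA_real_)     # "numeric"

# Indexing with a logical NA recycles it over the whole vector,
# giving all-NA values that keep the class of the original.
class(x_int[NA])    # "integer"
class(x_dbl[NA])    # "numeric"
length(x_int[NA])   # 3
```

Because x[NA] has the same length and class as x, it can be passed as the replacement argument of if_else without needing to know the column type in advance.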
Upvotes: 1