Reputation: 1332
My data has two columns that contain NA values. I created a new data frame with no NA values by removing every row that contained one.
Each time there is an NA value in the original data (called metadata here), I want to replace it with one value sampled at random from the new data frame (called temp). Because the NAs were removed from temp, there is no risk of picking NA again.
However, my original data does not change; it stays the same after running this:
temp = metadata %>% drop_na()
for (i in length(metadata$Gender)){
  if (is.na(metadata$Gender[[i]])) {
    metadata$Gender[[i]] = sample(temp$Gender, 1)
  }
  if (is.na(metadata$Age[[i]])){
    metadata$Age[[i]] = sample(temp$Age, 1)
  }
}
Upvotes: 0
Views: 41
Reputation: 887501
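Instead of creating another object and replacing the NAs based on it, we can loop across the columns of interest and replace the NA elements with a sample of the non-NA elements, specifying size as the count of NA elements. Note that with replace = FALSE, each column needs at least as many non-NA values as NA values, otherwise sample will throw an error.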
library(dplyr)
metadata <- metadata %>%
  mutate(across(c(Gender, Age), ~ replace(.x, is.na(.x),
    sample(.x[!is.na(.x)], size = sum(is.na(.x)), replace = FALSE))))
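As for why the loop in the question appears to do nothing: for (i in length(metadata$Gender)) iterates exactly once, with i equal to the number of rows, so only the last row is ever checked. Iterating over seq_along(metadata$Gender) visits every row. Below is a minimal sketch of that fix on made-up toy data; the values are invented for illustration, and na.omit() is used here as a base-R stand-in for drop_na().

```r
# Toy data standing in for the asker's `metadata` (values are made up)
metadata <- data.frame(
  Gender = c("F", NA, "M", "F", NA, "M"),
  Age    = c(34, 51, NA, 28, NA, 45),
  stringsAsFactors = FALSE
)

# Keep only rows with no NA in any column (base-R stand-in for drop_na())
temp <- na.omit(metadata)

length(metadata$Gender)     # a single number: 6
seq_along(metadata$Gender)  # the indices 1 2 3 4 5 6

set.seed(1)  # only so the random draws are reproducible
for (i in seq_along(metadata$Gender)) {
  # Fill a missing Gender with one random value from the complete rows
  if (is.na(metadata$Gender[i])) {
    metadata$Gender[i] <- sample(temp$Gender, 1)
  }
  # Fill a missing Age the same way
  if (is.na(metadata$Age[i])) {
    metadata$Age[i] <- sample(temp$Age, 1)
  }
}

any(is.na(metadata))  # FALSE: every NA has been filled
```

One caveat with this loop: if temp$Age ever contains a single number, sample(temp$Age, 1) draws from 1:temp$Age rather than returning that number, so it is worth making sure temp has more than one row.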
Upvotes: 1