Reputation: 3
This is my data frame:
Age <- c(10, 20, 15, NA, 34, NA, 40, NA, 50, NA)
Salary <- c(100, 120, 113, 140, 150, 160, 170, 180, 190, 200)
dat <- data.frame(Age, Salary)
I want to impute the missing values of Age with 12 only when Salary < 150, and with 30 only when Salary > 150. I have been trying to do this with dplyr but, being new to R, I have not found a way. How would I write this in R? Thanks.
Upvotes: 0
Views: 49
Reputation: 1114
Using data.table:
library(data.table)
dat <- data.table(dat)
dat[is.na(Age) & Salary < 150, Age := 12]
dat[is.na(Age) & Salary > 150, Age := 30]
> dat
    Age Salary
 1:  10    100
 2:  20    120
 3:  15    113
 4:  12    140
 5:  34    150
 6:  30    160
 7:  40    170
 8:  30    180
 9:  50    190
10:  30    200
It is not a one-liner, but it is easy to follow if you are new to R.
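Since the question mentions dplyr, here is a minimal sketch of the same two conditions written with mutate() and case_when(). It assumes dplyr is installed; the name dat_imputed is only illustrative and does not come from the question.

```r
library(dplyr)

# Data frame from the question
Age <- c(10, 20, 15, NA, 34, NA, 40, NA, 50, NA)
Salary <- c(100, 120, 113, 140, 150, 160, 170, 180, 190, 200)
dat <- data.frame(Age, Salary)

# dat_imputed is an illustrative name, not part of the original post
dat_imputed <- dat %>%
  mutate(Age = case_when(
    is.na(Age) & Salary < 150 ~ 12,  # missing Age, Salary below 150
    is.na(Age) & Salary > 150 ~ 30,  # missing Age, Salary above 150
    TRUE ~ Age                       # all other rows keep their Age
  ))
```

The conditions are checked in order, so rows that already have an Age, and any row with a Salary of exactly 150, fall through to the last branch and are left as they are.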
Upvotes: 1
Reputation: 1110
This could be an option:
# fill missing Age: 12 where Salary < 150, otherwise 30
na_idx <- is.na(dat$Age)
dat$Age[na_idx] <- ifelse(dat$Salary[na_idx] < 150, 12, 30)
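Note that the ifelse() call puts 30 in every missing Age whose Salary is not below 150, which includes a Salary of exactly 150. The question only asks for 30 when Salary > 150, so if that boundary matters, a base R sketch with two separate assignments might look like this (na_age is just an illustrative helper name):

```r
# Data frame from the question
Age <- c(10, 20, 15, NA, 34, NA, 40, NA, 50, NA)
Salary <- c(100, 120, 113, 140, 150, 160, 170, 180, 190, 200)
dat <- data.frame(Age, Salary)

# Remember which rows were missing before any assignment
na_age <- is.na(dat$Age)

# Fill each group separately; a missing Age with Salary == 150 would stay NA
dat$Age[na_age & dat$Salary < 150] <- 12
dat$Age[na_age & dat$Salary > 150] <- 30
```

With the data in the question both versions give the same result, because the only row with a Salary of 150 already has an Age.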
Upvotes: 0