Reputation: 63
I am trying to replace default "1900-01-01" for all column name that matches with date and DOB with na value. Below are the code.
library(dplyr)
data[is.na(data),] %>% select(matches("date"),DOB) %>% head()
FirstTradeDate LastTradeDate DOB
NA <NA> <NA> <NA>
NA.1 <NA> <NA> <NA>
NA.2 <NA> <NA> <NA>
NA.3 <NA> <NA> <NA>
NA.4 <NA> <NA> <NA>
NA.5 <NA> <NA> <NA>
Is this possible to replace with dplyr package? I have tried mutate and transmutate both are not working. AnyIdea?
I also tried
data[is.na(data),c(grep("[Dd]ate|DOB",names(data)))]<-as.Date("1970-01-01")
Error in `[<-.data.frame`(`*tmp*`, is.na(data), c(grep("[Dd]ate|DOB", :
non-existent rows not allowed
traceback()
3: stop("non-existent rows not allowed")
2: `[<-.data.frame`(`*tmp*`, is.na(data), c(grep("[Dd]ate|DOB",
names(data))), value = 0)
1: `[<-`(`*tmp*`, is.na(data), c(grep("[Dd]ate|DOB", names(data))),
value = 0)
Is there any other approach?
Upvotes: 1
Views: 1600
Reputation: 887118
You could use base R
indx <- grep('DOB|Date', colnames(dat1))
dat1[indx][dat1[indx]=='1900-01-01'] <- NA
head(dat1)
# FirstTradeDate LastTradeDate DOB Val
#1 1900-01-03 <NA> 1900-01-04 0.8804658
#2 <NA> 1900-01-03 1900-01-05 -1.6254185
#3 1900-01-02 1900-01-02 <NA> -0.3863661
#4 1900-01-04 1900-01-05 <NA> 0.6196153
#5 <NA> 1900-01-03 <NA> -1.1075294
#6 1900-01-05 1900-01-04 1900-01-03 -0.9048846
Or using data.table
library(data.table)
DT <- setDT(dat1)[, lapply(.SD, function(x) replace(x,
x=='1900-01-01', NA)), .SDcols=indx]
Or using dplyr
library(dplyr)
dat1 %>%
mutate_each(funs(replace(., .=='1900-01-01', NA)),matches('Date'), DOB) %>%
head()
# FirstTradeDate LastTradeDate DOB Val
#1 1900-01-03 <NA> 1900-01-04 0.8804658
#2 <NA> 1900-01-03 1900-01-05 -1.6254185
#3 1900-01-02 1900-01-02 <NA> -0.3863661
#4 1900-01-04 1900-01-05 <NA> 0.6196153
#5 <NA> 1900-01-03 <NA> -1.1075294
#6 1900-01-05 1900-01-04 1900-01-03 -0.9048846
The data.table
method can also be done in a loop
(as mentioned by @Arun)
nm1 <- names(dat1)[indx]
DT <- data.table(dat1)
for(nm in nm1){
setkeyv(DT, nm)
DT[J(as.Date('1900-01-01')), (nm):=NA]
}
Alternatively using for-loop
with set
:
require(data.table)
indx = grep("DOB|Date", names(dat1), value=TRUE)
setDT(dat1)
for (j in indx) {
thiscol = dat1[[j]]
set(dat1, i = which(thiscol == "1900-01-01"), j = j, value = NA)
}
Dates <- seq(as.Date('1900-01-01'), length.out=5, by='1 day')
set.seed(248)
dat1 <- data.frame(FirstTradeDate=sample(Dates,100,replace=TRUE),
LastTradeDate=sample(Dates,100, replace=TRUE), DOB=sample(Dates,100,
replace=TRUE), Val=rnorm(100))
Upvotes: 1