Reputation: 319
I have a dataframe with this structure:
Note.Reco  Reason.Reco  Suggestion.Reco  Contact
9          absent       tomorrow         yes
8                       tomorrow         yes
8          present      today            no
5                       yesterday        no
I would like to delete from this dataframe all the rows that have an empty value.
The expected result:
Note.Reco  Reason.Reco  Suggestion.Reco  Contact
9          absent       tomorrow         yes
8          present      today            no
I tried this R instruction:
IRC_DF[!(is.na(IRC_DF$Reason.Reco) | IRC_DF$Reason.Reco==" "), ]
But I get the input dataframe back unchanged.
Any ideas, please?
Thank you.
Upvotes: 11
Views: 58438
Reputation: 31
I hit the same error when fitting training data to a single decision tree. It was resolved once I removed the NA values from the raw data before splitting it into training and test sets. My guess is that the split left the data mismatched when fitting the model. The steps:
1: Remove NA values from the columns other than the predictor column.
2: Split the data into training and test sets.
3: Train the model; this will hopefully fix the error.
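The steps above could be sketched roughly as follows in base R. The data frame `raw`, its columns, and the 70/30 split ratio are all hypothetical, and for simplicity this sketch drops every row containing an NA in any column, which is an assumption rather than exactly what the answer describes.

```r
set.seed(42)  # make the random split reproducible

# Hypothetical raw data with a few missing values (not from the question)
raw <- data.frame(
  x1 = c(1.2, NA, 3.1, 4.8, 5.0, 6.3, NA, 8.1, 9.4, 10.2),
  x2 = c("a", "b", NA, "a", "b", "a", "b", "a", "b", "a"),
  y  = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1),
  stringsAsFactors = FALSE
)

# Step 1: drop rows containing NA before any splitting
clean <- na.omit(raw)

# Step 2: split the cleaned rows into training and test sets (70/30 here)
train_idx <- sample(seq_len(nrow(clean)), size = floor(0.7 * nrow(clean)))
train <- clean[train_idx, ]
test  <- clean[-train_idx, ]

# Step 3: fit the model on `train` and evaluate it on `test`
```

Cleaning before the split means both subsets come from the same NA-free rows, so neither can contain a missing value the other was never cleaned for.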
Upvotes: 0
Reputation: 2043
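Or use dplyr's filter function. Both conditions must hold, so they are combined with & (with |, a row whose Reason.Reco is "" would still pass, because it is not NA):
filter(IRC_DF, !is.na(Reason.Reco) & Reason.Reco != "")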
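A self-contained sketch of the dplyr approach, assuming the dplyr package is installed. The data mirrors the question, except that the last Reason.Reco value is set to NA (an assumption made here) to show that both missing and empty values are dropped. The two conditions are combined with `&`, so a row is kept only when Reason.Reco is neither NA nor empty.

```r
library(dplyr)

# Sample data modelled on the question; the NA in the last row is hypothetical
IRC_DF <- data.frame(
  Note.Reco       = c(9L, 8L, 8L, 5L),
  Reason.Reco     = c("absent", "", "present", NA),
  Suggestion.Reco = c("tomorrow", "tomorrow", "today", "yesterday"),
  Contact         = c("yes", "yes", "no", "no"),
  stringsAsFactors = FALSE
)

# Keep rows where Reason.Reco is present and not an empty string
kept <- filter(IRC_DF, !is.na(Reason.Reco) & Reason.Reco != "")
print(kept)
```

Note that `filter()` also drops rows where the condition itself evaluates to NA, so the explicit `!is.na()` mainly documents the intent.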
Upvotes: 8
Reputation: 887991
The comparison needs to use the empty string "" rather than a single space " ":
IRC_DF[!(is.na(IRC_DF$Reason.Reco) | IRC_DF$Reason.Reco == ""), ]
# Note.Reco Reason.Reco Suggestion.Reco Contact
#1 9 absent tomorrow yes
#3 8 present today no
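A small sketch of why comparing against " " returns the dataframe unchanged. The vector `reason` copies the Reason.Reco values from the question; the vector `messy` and the use of `trimws()` are additions made here, in case some cells hold only whitespace.

```r
reason <- c("absent", "", "present", "")

# A single space never equals the empty string, so no row is flagged
space_match <- reason == " "

# Comparing against "" flags the second and fourth values
empty_match <- reason == ""

# Hypothetical messier input: trimming first catches "", " " and NA alike
messy <- c("absent", " ", "present", "", NA)
is_blank <- is.na(messy) | trimws(messy) == ""

print(space_match)
print(empty_match)
print(is_blank)
```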
If multiple columns have NA or blanks (""), then
IRC_DF[Reduce(`&`, lapply(IRC_DF, function(x) !(is.na(x)|x==""))),]
The data used above:
IRC_DF <- structure(list(Note.Reco = c(9L, 8L, 8L, 5L), Reason.Reco = c("absent",
"", "present", ""), Suggestion.Reco = c("tomorrow", "tomorrow",
"today", "yesterday"), Contact = c("yes", "yes", "no", "no")), .Names = c("Note.Reco",
"Reason.Reco", "Suggestion.Reco", "Contact"), class = "data.frame", row.names = c(NA,
-4L))
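For reuse, the multi-column check could be wrapped in a small helper. This is an alternative sketch, not the code from the answer: the function name `drop_blank_rows` is hypothetical, it uses `rowSums()` over a logical matrix instead of `Reduce()`, and the fifth row with an NA is added here to show that missing values are handled too.

```r
# Hypothetical helper: drop rows where any column is NA or an empty string
drop_blank_rows <- function(df) {
  # Logical matrix: TRUE wherever a cell is NA or ""
  blank <- is.na(df) | df == ""
  # Keep only the rows with no blank cell
  df[rowSums(blank) == 0, , drop = FALSE]
}

# Data from the question, plus one hypothetical row with an NA in Contact
IRC_DF <- data.frame(
  Note.Reco       = c(9L, 8L, 8L, 5L, 7L),
  Reason.Reco     = c("absent", "", "present", "", "late"),
  Suggestion.Reco = c("tomorrow", "tomorrow", "today", "yesterday", "today"),
  Contact         = c("yes", "yes", "no", "no", NA),
  stringsAsFactors = FALSE
)

res <- drop_blank_rows(IRC_DF)
print(res)
```

Because `TRUE | NA` is `TRUE` in R, a cell that is NA is still counted as blank even though `df == ""` yields NA for it.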
Upvotes: 12