Reputation: 2065
I have a data set U1 on which I run a classifier to get a vector of predicted labels:
> pred.U1.nb.c <- predict(NB.C, U1[,2:6])
> table(pred.U1.nb.c)
pred.U1.nb.c
        S unlabeled
      148      5852
> head(pred.U1.nb.c)
[1] S S S S S S
Levels: S unlabeled
Now I want to extract the rows of U1 that were classified as S into a new data frame, U1.S. What is the most efficient way to do this?
Upvotes: 4
Views: 2914
Reputation: 263331
The answer by James has elegant economy going for it and would certainly work correctly with this example, but it is prone to undesirable results if the tested vector has any NAs: wherever a logical index is NA, "[" returns a row filled with NAs rather than dropping that row. (I have been bitten by this many times and been puzzled.) Here are two safer ways that avoid the NA-inclusive behavior of the "[" function:
U1[which(pred.U1.nb.c=="S"), ]
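Here which() converts the logical vector (possibly containing NAs) into an integer vector holding only the positions that are TRUE, so the index has no NAs. You can also use subset, which drops rows where the condition is NA: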
subset(U1 ,pred.U1.nb.c=="S")
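To make the difference concrete, here is a minimal sketch on toy data. The data frame and the label vector `pred` below are made-up stand-ins for U1 and pred.U1.nb.c (the real objects are not shown in the question), with two NAs planted in the labels on purpose:

```r
# Hypothetical stand-ins for U1 and pred.U1.nb.c; values are invented.
U1 <- data.frame(id = 1:5, x = c(0.2, 1.5, 3.1, 0.7, 2.4))
pred <- factor(c("S", NA, "unlabeled", "S", NA),
               levels = c("S", "unlabeled"))

# Plain logical indexing: every NA in the index produces a row of NAs.
by_logical <- U1[pred == "S", ]

# which() keeps only the positions where the comparison is TRUE.
by_which <- U1[which(pred == "S"), ]

# subset() treats an NA condition as "do not keep this row".
by_subset <- subset(U1, pred == "S")

print(by_logical)  # includes all-NA rows
print(by_which)    # only the rows labelled S
print(by_subset)   # same rows as by_which
```

With labels that contain no NAs, all three forms should select the same rows, so the choice only matters when missing predictions are possible.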
EDIT: I suspect that using grepl would also avoid the NA concern, since it reports FALSE rather than NA for missing values. Perhaps:
U1[grepl("^S$", pred.U1.nb.c), ]
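A quick way to check the grepl idea is to run it on a small example. This is only a sketch: `pred` and the data frame are invented stand-ins for pred.U1.nb.c and U1, and the extra level "S2" is there just to show why the pattern is anchored with ^ and $:

```r
# Hypothetical stand-ins for U1 and pred.U1.nb.c; values are invented.
U1 <- data.frame(id = 1:5, x = c(0.2, 1.5, 3.1, 0.7, 2.4))
pred <- factor(c("S", NA, "unlabeled", "S", "S2"))

# grepl() yields TRUE/FALSE only; the NA label comes back as FALSE.
# The anchors ^ and $ keep "S2" from matching.
keep <- grepl("^S$", pred)
print(keep)

# Rows whose label is exactly "S", stored as U1.S as in the question.
U1.S <- U1[keep, ]
print(U1.S)
```

Because the index contains no NAs, "[" cannot insert all-NA rows here.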
Upvotes: 11