Reputation: 29
I am trying to filter out data by searching for a particular keyword in the comments. I'm working with the following:
> resp4[c(12:15,155:165),]
    DstSource MinDist    A    B    C    TypeOth DEF
17        PLG      10 0.80 0.10 0.10       <NA>   0
18        OGT       0 0.70 0.10 0.20       COTE   0
19        OGT      10 1.00 0.00 0.00       <NA>   0
21        OGT       0 0.50 0.25 0.25       <NA>   0
301       OGT       0 1.00 0.00 0.00       LAGU   0
304       PLG       0 0.40 0.10 0.50 large gull   0
306       OGT       0 0.90 0.10 0.00      terns   0
309       OGT       0 0.80 0.20 0.00      terns   0
311       OGT       0 0.70 0.30 0.00      terns   0
312       OGT       0 1.00 0.00 0.00       LAGU   0
314       OGT       0 1.00 0.00 0.00       LAGU   0
315       OGT       0 0.50 0.50 0.00       LAGU   0
316       OGT       0 1.00 0.00 0.00       LAGU   0
317       OGT       0 1.00 0.00 0.00      terns   0
319       PLG      10 0.95 0.05 0.00       <NA>   0
I am trying to remove the rows where TypeOth either contains the phrase "tern" (it appears in both singular and plural, and in upper and lower case, throughout the data frame) or matches the whole expression "COTE". I know that if I wanted to keep only these entries, I could use
resp4 <- resp4[grep("tern|cote",resp4$TypeOth, ignore.case=T),]
Can someone point me towards how to index out the rows that my grep statement returns? Whenever I try to convert it to a logical command, it returns an empty object. For example, the following code does not work.
resp4 <- resp4[!(grep("tern|cote",resp4$TypeOth, ignore.case=T)),]
Upvotes: 0
Views: 953
Reputation: 460
Try
resp4 <- resp4[-grep("tern|cote",resp4$TypeOth, ignore.case = TRUE),]
Upvotes: 0
Reputation: 70286
Several options, among them:
resp4[grep("tern|cote",resp4$TypeOth, ignore.case = TRUE, invert = TRUE),]
Or
resp4[!grepl("tern|cote",resp4$TypeOth, ignore.case = TRUE),]
To name two of them.
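As for why the `!(grep(...))` attempt in the question comes back empty: `grep` returns integer positions, not a logical vector, and `!` applied to non-zero integers gives all `FALSE`, so no rows are selected. `grepl` returns one `TRUE`/`FALSE` per element, which is what `!` expects. Here is a small sketch on a made-up vector (`type_oth` is a hypothetical stand-in for `resp4$TypeOth`, not the real data):

```r
# Hypothetical toy values standing in for resp4$TypeOth
type_oth <- c(NA, "COTE", NA, "LAGU", "large gull", "terns", "Tern")

# grep() returns the integer positions of the matches
idx <- grep("tern|cote", type_oth, ignore.case = TRUE)

# Negating non-zero integers yields FALSE for each of them,
# so using !idx as a row index selects nothing
neg_idx <- !idx

# grepl() returns one logical per element (FALSE for NA),
# so negating it keeps exactly the non-matching entries
keep <- !grepl("tern|cote", type_oth, ignore.case = TRUE)
kept_values <- type_oth[keep]
```

Note that the `NA` entries are kept, because `grepl` treats a missing value as a non-match.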
Another one,
resp4[-grep("tern|cote",resp4$TypeOth, ignore.case = TRUE),]
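works in this case, but I wouldn't recommend it. Why? Look what happens if no matches are found. grep returns integer(0), and negating an empty index is still an empty index, so every row is dropped instead of none: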
> resp4[-grep("XXXX",resp4$TypeOth, ignore.case = TRUE),]
#[1] DstSource MinDist A B C TypeOth DEF
#<0 rows> (or 0-length row.names)
The first two options are safer for such cases.
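To see the difference side by side, here is a minimal sketch on a made-up data frame (`df` and its values are hypothetical, chosen only to illustrate the no-match case):

```r
# Hypothetical toy data frame; not the asker's resp4
df <- data.frame(id = 1:4,
                 TypeOth = c("gull", "COTE", "LAGU", "terns"),
                 stringsAsFactors = FALSE)

# A pattern that matches nothing
no_match <- grep("XXXX", df$TypeOth, ignore.case = TRUE)

# Negative indexing with an empty index drops every row
n_minus  <- nrow(df[-no_match, ])

# Both safer forms keep all rows when nothing matches
n_grepl  <- nrow(df[!grepl("XXXX", df$TypeOth, ignore.case = TRUE), ])
n_invert <- nrow(df[grep("XXXX", df$TypeOth, ignore.case = TRUE, invert = TRUE), ])
```

With this toy input, `n_minus` is 0 while `n_grepl` and `n_invert` are both 4.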
Upvotes: 3