Maalthou

Reputation: 29

Using grep to index out rows in R

I am trying to filter out data by searching for a particular keyword in the comments. I'm working with the following:

  > resp4[c(12:15,155:165),]
      DstSource MinDist    A    B    C    TypeOth DEF
  17        PLG      10 0.80 0.10 0.10       <NA>   0
  18        OGT       0 0.70 0.10 0.20       COTE   0
  19        OGT      10 1.00 0.00 0.00       <NA>   0
  21        OGT       0 0.50 0.25 0.25       <NA>   0
  301       OGT       0 1.00 0.00 0.00       LAGU   0
  304       PLG       0 0.40 0.10 0.50 large gull   0
  306       OGT       0 0.90 0.10 0.00      terns   0
  309       OGT       0 0.80 0.20 0.00      terns   0
  311       OGT       0 0.70 0.30 0.00      terns   0
  312       OGT       0 1.00 0.00 0.00       LAGU   0
  314       OGT       0 1.00 0.00 0.00       LAGU   0
  315       OGT       0 0.50 0.50 0.00       LAGU   0
  316       OGT       0 1.00 0.00 0.00       LAGU   0
  317       OGT       0 1.00 0.00 0.00      terns   0
  319       PLG      10 0.95 0.05 0.00       <NA>   0

I am trying to remove the rows where TypeOth either contains the phrase "tern" (it appears in both singular and plural, and in upper and lower case, throughout the data frame) or matches the whole expression "COTE". I know that if I wanted to keep only these entries, I could use

 resp4 <- resp4[grep("tern|cote",resp4$TypeOth, ignore.case=T),] 

Can someone point me towards how to index out the rows that my grep statement returns? Whenever I try to negate it as a logical condition, it returns an empty object. For example, the following code does not work:

 resp4 <- resp4[!(grep("tern|cote",resp4$TypeOth, ignore.case=T)),]  

Upvotes: 0

Views: 953

Answers (2)

schosse-sitzer

Reputation: 460

Try

resp4 <- resp4[-grep("tern|cote",resp4$TypeOth, ignore.case = TRUE),]

Upvotes: 0

talat

Reputation: 70286

Several options, among them:

resp4[grep("tern|cote",resp4$TypeOth, ignore.case = TRUE, invert = TRUE),]

Or

resp4[!grepl("tern|cote",resp4$TypeOth, ignore.case = TRUE),] 

To name two of them.
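To see why the original attempt with ! around grep comes back empty while grepl works, here is a minimal sketch on made-up data. The data frame toy and its id column are assumptions for illustration only, not the real resp4.

```r
# Toy data standing in for resp4 (illustrative values, not the real data set)
toy <- data.frame(
  id      = 1:6,
  TypeOth = c(NA, "COTE", "terns", "LAGU", "large gull", "Tern"),
  stringsAsFactors = FALSE
)

# grep() returns integer positions of the matching elements
pos <- grep("tern|cote", toy$TypeOth, ignore.case = TRUE)
pos                 # 2 3 6

# Negating integers coerces them to logical first: nonzero -> TRUE -> FALSE
!pos                # FALSE FALSE FALSE
nrow(toy[!pos, ])   # 0, an all-FALSE index selects no rows

# grepl() returns one TRUE/FALSE per element, so it can be negated safely
hit <- grepl("tern|cote", toy$TypeOth, ignore.case = TRUE)
kept <- toy[!hit, ]
kept$id             # 1 4 5 (the NA row is kept, since NA counts as no match)
```

In short, ! only makes sense on the logical vector from grepl, not on the positions from grep.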

Another one,

resp4[-grep("tern|cote",resp4$TypeOth, ignore.case = TRUE),] 

works in this case, but I wouldn't recommend it. Why? Look at what happens if no matches are found:

> resp4[-grep("XXXX",resp4$TypeOth, ignore.case = TRUE),] 
#[1] DstSource MinDist   A         B         C         TypeOth   DEF      
#<0 rows> (or 0-length row.names)

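The first two options are safer in such cases. When nothing matches, grep returns integer(0), and negating that is still integer(0), so the negative index selects no rows at all instead of keeping every row.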
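The no-match case can be reproduced on a small example. This is only a sketch: the data frame toy is invented for illustration, and "XXXX" is simply a pattern that matches nothing in it.

```r
# Toy data (illustrative only); no value contains "XXXX"
toy <- data.frame(
  id      = 1:4,
  TypeOth = c("COTE", "terns", "LAGU", "large gull"),
  stringsAsFactors = FALSE
)

# Negative indexing: breaks when there are no matches
none <- grep("XXXX", toy$TypeOth, ignore.case = TRUE)
length(none)                   # 0
dropped_all <- toy[-none, ]
nrow(dropped_all)              # 0, every row is lost

# invert = TRUE: returns the positions of the non-matching elements
via_invert <- toy[grep("XXXX", toy$TypeOth, ignore.case = TRUE, invert = TRUE), ]
nrow(via_invert)               # 4, all rows kept

# !grepl(): logical index with one entry per row
via_grepl <- toy[!grepl("XXXX", toy$TypeOth, ignore.case = TRUE), ]
nrow(via_grepl)                # 4, all rows kept
```

If negative indexing has to be used anyway, checking length(none) > 0 before subsetting avoids the empty result.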

Upvotes: 3
