Panchito

Reputation: 347

Remove rows that match a value

I'm trying to filter out some data. Several columns hold numeric values, and any row where all of them are zero must go. I've thought about performing multiple matches with which, like so:

match1 <- match(which(storm$FATALITIES==0), which(storm$INJURIES==0))
match2 <- match(which(storm$CROPDMG==0), which(storm$CROPDMGEXP==0))
match3 <- match(which(storm$PROPDMG==0), which(storm$PROPDMGEXP==0))
match4 <- match(match1, match2)
matchF <- match(match4, match3)

but it clearly doesn't work, since match() returns positions within its second argument rather than row numbers in storm. The data looks something like this:

             BGN_DATE STATE  EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
1   4/18/1950 0:00:00    AL TORNADO          0       15    25.0          K       3
2   4/18/1950 0:00:00    AL TORNADO          0        0     0.0          K       0
3   2/20/1951 0:00:00    AL TORNADO          0        2    25.0          K       0
4    6/8/1951 0:00:00    AL TORNADO          0        2     0.0          K       0
5  11/15/1951 0:00:00    AL TORNADO          0        0     0.0          K       0
6  11/15/1951 0:00:00    AL TORNADO          1        6     2.5          K       0
7  11/16/1951 0:00:00    AL TORNADO          0        1     2.5          K       0
   CROPDMGEXP LATITUDE LONGITUDE REFNUM
1                 3040      8812      1
2                 3042      8755      2
3                 3340      8742      3
4                 3458      8626      4
5                 3412      8642      5
6                 3450      8748      6
7                 3405      8631      7

I'm interested in removing all entries that are 0 for INJURIES, FATALITIES, CROPDMG, and PROPDMG (all of them simultaneously). I've already filtered out NAs with complete.cases(). Thanks.

Upvotes: 2

Views: 408

Answers (1)

flodel

Reputation: 89107

Here are a couple of ways. One is interactive and very intuitive:

subset(storm, INJURIES   != 0 |
              FATALITIES != 0 |
              CROPDMG    != 0 |
              PROPDMG    != 0)

and one programmatic, hence more flexible/scalable:

fields <- c('INJURIES', 'FATALITIES', 'CROPDMG', 'PROPDMG')
keep   <- rowSums(storm[fields] != 0) > 0
storm[keep, ]
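
To see that both approaches select the same rows, here is a minimal sketch on a made-up data frame. The name toy and all of its values are assumptions for illustration, loosely modeled on the sample in the question and not taken from the real storm data. One NA is included on purpose to show an edge case:

```r
# Toy data loosely modeled on the question's sample; values are illustrative only
toy <- data.frame(
  REFNUM     = 1:5,
  FATALITIES = c(0, 0, 0, 1, 0),
  INJURIES   = c(15, 0, 2, 6, 0),
  PROPDMG    = c(25, 0, 25, 2.5, 0),
  CROPDMG    = c(3, 0, 0, 0, NA)   # NA added on purpose to show the edge case
)
fields <- c("INJURIES", "FATALITIES", "CROPDMG", "PROPDMG")

# Approach 1: keep rows where at least one field is non-zero.
# subset() silently drops rows whose condition evaluates to NA.
by_subset <- subset(toy, INJURIES != 0 | FATALITIES != 0 |
                         CROPDMG  != 0 | PROPDMG    != 0)

# Approach 2: count the non-zero fields per row.
# toy[fields] != 0 is a logical matrix; na.rm = TRUE counts NA as "not non-zero".
n_nonzero  <- rowSums(toy[fields] != 0, na.rm = TRUE)
by_rowsums <- toy[n_nonzero > 0, ]

n_nonzero            # 3 0 2 3 0
by_subset$REFNUM     # 1 3 4
by_rowsums$REFNUM    # 1 3 4
```

The na.rm = TRUE argument is an addition in this sketch. Without it, a row containing NA gets an NA count, and indexing with that NA returns an all-NA row. If NAs have already been removed with complete.cases(), as in the question, the original one-liner behaves the same.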

Upvotes: 3
