Reputation: 347
I'm trying to filter out some data. Say several columns contain numeric values; a row must go if the value is zero in all of those columns. I've thought about performing multiple matches with which, like so:
match1 <- match(which(storm$FATALITIES==0), which(storm$INJURIES==0))
match2 <- match(which(storm$CROPDMG==0), which(storm$CROPDMGEXP==0))
match3 <- match(which(storm$PROPDMG==0), which(storm$PROPDMGEXP==0))
match4 <- match(match1, match2)
matchF <- match(match4, match3)
but it clearly doesn't work, since it's giving positions relative to the last vector rather than row numbers... The data looks something like this:
            BGN_DATE STATE  EVTYPE FATALITIES INJURIES PROPDMG PROPDMGEXP CROPDMG
1  4/18/1950 0:00:00    AL TORNADO          0       15    25.0          K       3
2  4/18/1950 0:00:00    AL TORNADO          0        0     0.0          K       0
3  2/20/1951 0:00:00    AL TORNADO          0        2    25.0          K       0
4   6/8/1951 0:00:00    AL TORNADO          0        2     0.0          K       0
5 11/15/1951 0:00:00    AL TORNADO          0        0     0.0          K       0
6 11/15/1951 0:00:00    AL TORNADO          1        6     2.5          K       0
7 11/16/1951 0:00:00    AL TORNADO          0        1     2.5          K       0
  CROPDMGEXP LATITUDE LONGITUDE REFNUM
1                3040      8812      1
2                3042      8755      2
3                3340      8742      3
4                3458      8626      4
5                3412      8642      5
6                3450      8748      6
7                3405      8631      7
I'm interested in removing all rows that are 0 for INJURIES, FATALITIES, CROPDMG and PROPDMG (all of them simultaneously). I've already filtered out NA with complete.cases(). Thanks
Upvotes: 2
Views: 408
Reputation: 89107
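Here are a couple of ways. Both keep the rows where at least one of the four columns is nonzero, which is the same as dropping the rows where all of them are zero. One is interactive and very intuitive: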
subset(storm, INJURIES   != 0 |
              FATALITIES != 0 |
              CROPDMG    != 0 |
              PROPDMG    != 0)
and one programmatic, hence more flexible/scalable:
fields <- c('INJURIES', 'FATALITIES', 'CROPDMG', 'PROPDMG')
# count the nonzero fields in each row; keep rows with at least one
keep <- rowSums(storm[fields] != 0) > 0
storm[keep, ]
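To check that the two approaches agree, here is a minimal sketch on a toy data frame. The name storm_toy and all of its values are made up for illustration and are not taken from the real storm data; only the four column names come from the question.

```r
# Toy stand-in for `storm` (hypothetical values, for illustration only).
storm_toy <- data.frame(
  EVTYPE     = c("TORNADO", "TORNADO", "HAIL", "FLOOD"),
  FATALITIES = c(0, 0, 1, 0),
  INJURIES   = c(15, 0, 0, 0),
  PROPDMG    = c(25, 0, 0, 0),
  CROPDMG    = c(0, 0, 0, 0)
)

fields <- c("INJURIES", "FATALITIES", "CROPDMG", "PROPDMG")

# Approach 1: keep rows where at least one field is nonzero.
by_subset <- subset(storm_toy, INJURIES != 0 | FATALITIES != 0 |
                               CROPDMG != 0 | PROPDMG != 0)

# Approach 2: count the nonzero fields per row, keep rows with a count above 0.
keep <- rowSums(storm_toy[fields] != 0) > 0
by_rowsums <- storm_toy[keep, ]

# Both approaches should retain the same rows (rows 2 and 4 are all zero).
stopifnot(identical(rownames(by_subset), rownames(by_rowsums)))
```

The question says NAs were already removed with complete.cases(). If that were not the case, passing na.rm = TRUE to rowSums would be one way to ignore missing values when counting, whereas subset simply drops rows whose condition evaluates to NA.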
Upvotes: 3